Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-09 Thread joel



On August 8, 2018 6:47:16 PM EDT, "Paul E. McKenney" 
 wrote:
>On Wed, Aug 08, 2018 at 03:15:31PM -0700, Joel Fernandes wrote:
>> On Wed, Aug 8, 2018 at 1:18 PM, Paul E. McKenney
>>  wrote:
>> [...]
>> >> >> >> It does start to seem like a show stopper :-(
>> >> >> >
>> >> >> > I suppose that an srcu_read_lock_nmi() and
>srcu_read_unlock_nmi() could
>> >> >> > be added, which would do atomic ops on
>sp->sda->srcu_lock_count.  Not sure
>> >> >> > whether this would be fast enough to be useful, but easy to
>provide:
>> >> >> >
>> >> >> > int __srcu_read_lock_nmi(struct srcu_struct *sp)  /*
>UNTESTED. */
>> >> >> > {
>> >> >> > int idx;
>> >> >> >
>> >> >> > idx = READ_ONCE(sp->srcu_idx) & 0x1;
>> >> >> > atomic_inc(>sda->srcu_lock_count[idx]);
>> >> >> > smp_mb__after_atomic(); /* B */  /* Avoid leaking
>critical section. */
>> >> >> > return idx;
>> >> >> > }
>> >> >> >
>> >> >> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
>> >> >> > {
>> >> >> > smp_mb__before_atomic(); /* C */  /* Avoid leaking
>critical section. */
>> >> >> > atomic_inc(>sda->srcu_unlock_count[idx]);
>> >> >> > }
>> >> >> >
>> >> >> > With appropriate adjustments to also allow Tiny RCU to also
>work.
>> >> >> >
>> >> >> > Note that you have to use _nmi() everywhere, not just in NMI
>handlers.
>> >> >> > In fact, the NMI handlers are the one place you -don't- need
>to use
>> >> >> > _nmi(), strangely enough.
>> >> >> >
>> >> >> > Might be worth a try -- smp_mb__{before,after}_atomic() is a
>no-op on
>> >> >> > some architectures, for example.
>> >> >>
>> >> >> Continuing Steve's question on regular interrupts, do we need
>to use
>> >> >> this atomic_inc API for regular interrupts as well? So I guess
>> >> >
>> >> > If NMIs use one srcu_struct and non-NMI uses another, the
>current
>> >> > srcu_read_lock() and srcu_read_unlock() will work just fine.  If
>any given
>> >> > srcu_struct needs both NMI and non-NMI readers, then we really
>do need
>> >> > __srcu_read_lock_nmi() and __srcu_read_unlock_nmi() for that
>srcu_struct.
>> >>
>> >> Yes, I believe as long as in_nmi() works reliably, we can use the
>> >> right srcu_struct (NMI vs non-NMI) and it would be fine.
>> >>
>> >> Going through this thread, it sounds though that this_cpu_inc may
>not
>> >> be reliable on all architectures even for non-NMI interrupts and
>> >> local_inc may be the way to go.
>> >
>> > My understanding is that this_cpu_inc() is defined to handle
>interrupts,
>> > so any architecture on which it is unreliable needs to fix its bug.
> ;-)
>> 
>> Yes that's my understanding as well.
>> 
>> Then may be I'm missing something about yours/Steve's conversations
>in
>> the morning, about why we need bother with the local_inc then. So the
>> current SRCU code with the separate NMI handle should work fine (for
>> future merge windows) as long as we're using a separate srcu_struct
>> for NMI. :-)
>
>I do believe that to be true.  But only as long as that separate
>srcu_struct is -only- used for NMI.
>
>If this is what you have been pushing for all along, please accept my
>apologies for my being slow!
>

That's ok, sorry I initially didn't describe it well which may have caused 
confusion, but yes that's what I was pushing for.

>That said, your approach does require you to have a perfect way to
>distinguish between NMI and not-NMI.  If the distinguishing is even
>in the slightest imperfect, then some sort of NMI-safe SRCU reader
>approach is of course required.
>

Thanks Paul, agreed with everything and we are on the same page.

- Joel



>   Thanx, Paul
>
>> >> For next merge window (not this one), lets do that then? Paul, if
>you
>> >> could provide me an SRCU API that uses local_inc, then I believe
>that
>> >> coupled with this patch should be all that's needed:
>> >> https://lore.kernel.org/patchwork/patch/972657/
>> >>
>> >> Steve did express concern though if in_nmi() works reliably (i.e.
>> >> tracepoint doesn't fire from "thunk" code before in_nmi() is
>> >> available). Any thoughts on that Steve?
>> >
>> > Agreed, not the upcoming merge window.  But we do need to work out
>> > exactly what is the right way to do this.
>> 
>> Agreed, thanks!
>> 
>>  - Joel
>> 

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Paul E. McKenney
On Wed, Aug 08, 2018 at 03:15:31PM -0700, Joel Fernandes wrote:
> On Wed, Aug 8, 2018 at 1:18 PM, Paul E. McKenney wrote:
> [...]
> >> >> >> It does start to seem like a show stopper :-(
> >> >> >
> >> >> > I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
> >> >> > be added, which would do atomic ops on sp->sda->srcu_lock_count.  Not sure
> >> >> > whether this would be fast enough to be useful, but easy to provide:
> >> >> >
> >> >> > int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
> >> >> > {
> >> >> > 	int idx;
> >> >> >
> >> >> > 	idx = READ_ONCE(sp->srcu_idx) & 0x1;
> >> >> > 	atomic_inc(&sp->sda->srcu_lock_count[idx]);
> >> >> > 	smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
> >> >> > 	return idx;
> >> >> > }
> >> >> >
> >> >> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> >> >> > {
> >> >> > 	smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
> >> >> > 	atomic_inc(&sp->sda->srcu_unlock_count[idx]);
> >> >> > }
> >> >> >
> >> >> > With appropriate adjustments to allow Tiny RCU to also work.
> >> >> >
> >> >> > Note that you have to use _nmi() everywhere, not just in NMI handlers.
> >> >> > In fact, the NMI handlers are the one place you -don't- need to use
> >> >> > _nmi(), strangely enough.
> >> >> >
> >> >> > Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
> >> >> > some architectures, for example.
> >> >>
> >> >> Continuing Steve's question on regular interrupts, do we need to use
> >> >> this atomic_inc API for regular interrupts as well? So I guess
> >> >
> >> > If NMIs use one srcu_struct and non-NMI uses another, the current
> >> > srcu_read_lock() and srcu_read_unlock() will work just fine.  If any given
> >> > srcu_struct needs both NMI and non-NMI readers, then we really do need
> >> > __srcu_read_lock_nmi() and __srcu_read_unlock_nmi() for that srcu_struct.
> >>
> >> Yes, I believe as long as in_nmi() works reliably, we can use the
> >> right srcu_struct (NMI vs non-NMI) and it would be fine.
> >>
> >> Going through this thread, it sounds though that this_cpu_inc may not
> >> be reliable on all architectures even for non-NMI interrupts and
> >> local_inc may be the way to go.
> >
> > My understanding is that this_cpu_inc() is defined to handle interrupts,
> > so any architecture on which it is unreliable needs to fix its bug.  ;-)
> 
> Yes, that's my understanding as well.
> 
> Then maybe I'm missing something about your and Steve's conversation this
> morning about why we need to bother with local_inc. So the
> current SRCU code with the separate NMI handling should work fine (for
> future merge windows) as long as we're using a separate srcu_struct
> for NMI. :-)

I do believe that to be true.  But only as long as that separate
srcu_struct is -only- used for NMI.

If this is what you have been pushing for all along, please accept my
apologies for my being slow!

That said, your approach does require you to have a perfect way to
distinguish between NMI and not-NMI.  If the distinguishing is even
in the slightest imperfect, then some sort of NMI-safe SRCU reader
approach is of course required.

Thanx, Paul

> >> For the next merge window (not this one), let's do that then? Paul, if you
> >> could provide me an SRCU API that uses local_inc, then I believe that,
> >> coupled with this patch, should be all that's needed:
> >> https://lore.kernel.org/patchwork/patch/972657/
> >>
> >> Steve did express concern though about whether in_nmi() works reliably (i.e.
> >> the tracepoint doesn't fire from "thunk" code before in_nmi() is
> >> available). Any thoughts on that, Steve?
> >
> > Agreed, not the upcoming merge window.  But we do need to work out
> > exactly what is the right way to do this.
> 
> Agreed, thanks!
> 
>  - Joel
> 



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Joel Fernandes
On Wed, Aug 8, 2018 at 1:18 PM, Paul E. McKenney wrote:
[...]
>> >> >> It does start to seem like a show stopper :-(
>> >> >
>> >> > I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
>> >> > be added, which would do atomic ops on sp->sda->srcu_lock_count.  Not sure
>> >> > whether this would be fast enough to be useful, but easy to provide:
>> >> >
>> >> > int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
>> >> > {
>> >> > 	int idx;
>> >> >
>> >> > 	idx = READ_ONCE(sp->srcu_idx) & 0x1;
>> >> > 	atomic_inc(&sp->sda->srcu_lock_count[idx]);
>> >> > 	smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
>> >> > 	return idx;
>> >> > }
>> >> >
>> >> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
>> >> > {
>> >> > 	smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
>> >> > 	atomic_inc(&sp->sda->srcu_unlock_count[idx]);
>> >> > }
>> >> >
>> >> > With appropriate adjustments to allow Tiny RCU to also work.
>> >> >
>> >> > Note that you have to use _nmi() everywhere, not just in NMI handlers.
>> >> > In fact, the NMI handlers are the one place you -don't- need to use
>> >> > _nmi(), strangely enough.
>> >> >
>> >> > Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
>> >> > some architectures, for example.
>> >>
>> >> Continuing Steve's question on regular interrupts, do we need to use
>> >> this atomic_inc API for regular interrupts as well? So I guess
>> >
>> > If NMIs use one srcu_struct and non-NMI uses another, the current
>> > srcu_read_lock() and srcu_read_unlock() will work just fine.  If any given
>> > srcu_struct needs both NMI and non-NMI readers, then we really do need
>> > __srcu_read_lock_nmi() and __srcu_read_unlock_nmi() for that srcu_struct.
>>
>> Yes, I believe as long as in_nmi() works reliably, we can use the
>> right srcu_struct (NMI vs non-NMI) and it would be fine.
>>
>> Going through this thread, it sounds though that this_cpu_inc may not
>> be reliable on all architectures even for non-NMI interrupts and
>> local_inc may be the way to go.
>
> My understanding is that this_cpu_inc() is defined to handle interrupts,
> so any architecture on which it is unreliable needs to fix its bug.  ;-)

Yes, that's my understanding as well.

Then maybe I'm missing something about your and Steve's conversation this
morning about why we need to bother with local_inc. So the
current SRCU code with the separate NMI handling should work fine (for
future merge windows) as long as we're using a separate srcu_struct
for NMI. :-)

>
>> For the next merge window (not this one), let's do that then? Paul, if you
>> could provide me an SRCU API that uses local_inc, then I believe that,
>> coupled with this patch, should be all that's needed:
>> https://lore.kernel.org/patchwork/patch/972657/
>>
>> Steve did express concern though about whether in_nmi() works reliably (i.e.
>> the tracepoint doesn't fire from "thunk" code before in_nmi() is
>> available). Any thoughts on that, Steve?
>
> Agreed, not the upcoming merge window.  But we do need to work out
> exactly what is the right way to do this.

Agreed, thanks!

 - Joel
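
For reference, a minimal sketch of the reader-side selection being agreed
on here, assuming two hypothetical srcu_structs (tracepoint_srcu for
non-NMI readers, tracepoint_srcu_nmi used only from NMI); the actual patch
linked above may differ in detail:

	/* Hypothetical probe invocation; names are illustrative only. */
	static void tp_invoke_probe(void (*probe)(void *data), void *data)
	{
		struct srcu_struct *ss;
		int idx;

		/* NMI readers get their own srcu_struct, per the thread. */
		ss = in_nmi() ? &tracepoint_srcu_nmi : &tracepoint_srcu;
		idx = srcu_read_lock(ss);
		probe(data);
		srcu_read_unlock(ss, idx);
	}

This is safe only because each srcu_struct then sees a single class of
reader, and it relies on in_nmi() being accurate at every tracepoint
site -- exactly the "thunk" concern raised in this thread.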


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Paul E. McKenney
On Wed, Aug 08, 2018 at 12:24:20PM -0700, Joel Fernandes wrote:
> On Wed, Aug 8, 2018 at 7:49 AM, Paul E. McKenney wrote:
> [...]
> >
> >> In that case, based on what you're saying, the patch I sent using a
> >> different srcu_struct for NMI is still good, I guess...
> >
> > As long as you wait for both SRCU grace periods.  Hmmm...  Maybe that means
> > that there is still a use for synchronize_rcu_mult():
> >
> > void call_srcu_nmi(struct rcu_head *rhp, rcu_callback_t func)
> > {
> > 	call_srcu(&my_srcu_struct_nmi, rhp, func);
> > }
> >
> > void call_srcu_nonmi(struct rcu_head *rhp, rcu_callback_t func)
> > {
> > 	call_srcu(&my_srcu_struct_nonmi, rhp, func);
> > }
> >
> > ...
> >
> > /* Wait concurrently on the two grace periods. */
> > synchronize_rcu_mult(call_srcu_nmi, call_srcu_nonmi);
> >
> > On the other hand, I bet that doing this is just fine in your use case:
> >
> > synchronize_srcu(&my_srcu_struct_nmi);
> > synchronize_srcu(&my_srcu_struct_nonmi);
> >
> > But please note that synchronize_rcu_mult() is no longer in my -rcu tree,
> > so if you do want it please let me know (and please let me know why it
> > is important).
> 
> I did the chaining thing (one callback calling another); that should
> work too, right? I believe that is needed so that the tracepoint
> callbacks are freed at one point and only when both NMI and non-NMI
> read sections have completed.

Yes, that works also.  It is possible to make that happen concurrently
via atomic_dec_and_test() or similar, but if the latency is not a problem,
why bother?
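
A sketch of that concurrent variant, for concreteness (hypothetical type
and field names; the srcu_struct names follow Paul's example above):

	/* Two rcu_heads, one per srcu_struct; last callback in frees. */
	struct tp_probes {
		struct rcu_head rcu_nmi;
		struct rcu_head rcu_nonmi;
		atomic_t pending;
		/* ... old probe array ... */
	};

	static void tp_free_one(struct tp_probes *p)
	{
		if (atomic_dec_and_test(&p->pending))
			kfree(p);
	}

	static void tp_free_nmi_cb(struct rcu_head *rhp)
	{
		tp_free_one(container_of(rhp, struct tp_probes, rcu_nmi));
	}

	static void tp_free_nonmi_cb(struct rcu_head *rhp)
	{
		tp_free_one(container_of(rhp, struct tp_probes, rcu_nonmi));
	}

	static void tp_release(struct tp_probes *p)
	{
		atomic_set(&p->pending, 2);
		call_srcu(&my_srcu_struct_nmi, &p->rcu_nmi, tp_free_nmi_cb);
		call_srcu(&my_srcu_struct_nonmi, &p->rcu_nonmi, tp_free_nonmi_cb);
	}

The memory is freed only after both grace periods have elapsed, but the
two waits overlap instead of running back to back.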

> >> >> It does start to seem like a show stopper :-(
> >> >
> >> > I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
> >> > be added, which would do atomic ops on sp->sda->srcu_lock_count.  Not 
> >> > sure
> >> > whether this would be fast enough to be useful, but easy to provide:
> >> >
> >> > int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
> >> > {
> >> > int idx;
> >> >
> >> > idx = READ_ONCE(sp->srcu_idx) & 0x1;
> >> > atomic_inc(>sda->srcu_lock_count[idx]);
> >> > smp_mb__after_atomic(); /* B */  /* Avoid leaking critical 
> >> > section. */
> >> > return idx;
> >> > }
> >> >
> >> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> >> > {
> >> > smp_mb__before_atomic(); /* C */  /* Avoid leaking critical 
> >> > section. */
> >> > atomic_inc(>sda->srcu_unlock_count[idx]);
> >> > }
> >> >
> >> > With appropriate adjustments to also allow Tiny RCU to also work.
> >> >
> >> > Note that you have to use _nmi() everywhere, not just in NMI handlers.
> >> > In fact, the NMI handlers are the one place you -don't- need to use
> >> > _nmi(), strangely enough.
> >> >
> >> > Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
> >> > some architectures, for example.
> >>
> >> Continuing Steve's question on regular interrupts, do we need to use
> >> this atomic_inc API for regular interrupts as well? So I guess
> >
> > If NMIs use one srcu_struct and non-NMI uses another, the current
> > srcu_read_lock() and srcu_read_unlock() will work just fine.  If any given
> > srcu_struct needs both NMI and non-NMI readers, then we really do need
> > __srcu_read_lock_nmi() and __srcu_read_unlock_nmi() for that srcu_struct.
> 
> Yes, I believe as long as in_nmi() works reliably, we can use the
> right srcu_struct (NMI vs non-NMI) and it would be fine.
> 
> Going through this thread, it sounds though that this_cpu_inc may not
> be reliable on all architectures even for non-NMI interrupts and
> local_inc may be the way to go.

My understanding is that this_cpu_inc() is defined to handle interrupts,
so any architecture on which it is unreliable needs to fix its bug.  ;-)

> For next merge window (not this one), lets do that then? Paul, if you
> could provide me an SRCU API that uses local_inc, then I believe that
> coupled with this patch should be all that's needed:
> https://lore.kernel.org/patchwork/patch/972657/
> 
> Steve did express concern though if in_nmi() works reliably (i.e.
> tracepoint doesn't fire from "thunk" code before in_nmi() is
> available). Any thoughts on that Steve?

Agreed, not the upcoming merge window.  But we do need to work out
exactly what is the right way to do this.

Thanx, Paul



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Joel Fernandes
On Wed, Aug 8, 2018 at 7:49 AM, Paul E. McKenney wrote:
[...]
>
>> In that case, based on what you're saying, the patch I sent using a
>> different srcu_struct for NMI is still good, I guess...
>
> As long as you wait for both SRCU grace periods.  Hmmm...  Maybe that means
> that there is still a use for synchronize_rcu_mult():
>
> void call_srcu_nmi(struct rcu_head *rhp, rcu_callback_t func)
> {
> 	call_srcu(&my_srcu_struct_nmi, rhp, func);
> }
>
> void call_srcu_nonmi(struct rcu_head *rhp, rcu_callback_t func)
> {
> 	call_srcu(&my_srcu_struct_nonmi, rhp, func);
> }
>
> ...
>
> /* Wait concurrently on the two grace periods. */
> synchronize_rcu_mult(call_srcu_nmi, call_srcu_nonmi);
>
> On the other hand, I bet that doing this is just fine in your use case:
>
> synchronize_srcu(&my_srcu_struct_nmi);
> synchronize_srcu(&my_srcu_struct_nonmi);
>
> But please note that synchronize_rcu_mult() is no longer in my -rcu tree,
> so if you do want it please let me know (and please let me know why it
> is important).

I did the chaining thing (one callback calling another); that should
work too, right? I believe that is needed so that the tracepoint
callbacks are freed at one point and only when both NMI and non-NMI
read sections have completed.
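
Roughly like this, with hypothetical names (a single rcu_head suffices,
since the second grace period is started from inside the first callback):

	struct tp_probes {
		struct rcu_head rcu;
		/* ... old probe array ... */
	};

	static void tp_free_final_cb(struct rcu_head *rhp)
	{
		kfree(container_of(rhp, struct tp_probes, rcu));
	}

	static void tp_chain_nmi_cb(struct rcu_head *rhp)
	{
		/* Non-NMI readers are done; now wait out NMI readers. */
		call_srcu(&my_srcu_struct_nmi, rhp, tp_free_final_cb);
	}

	static void tp_release(struct tp_probes *p)
	{
		call_srcu(&my_srcu_struct_nonmi, &p->rcu, tp_chain_nmi_cb);
	}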

>> >> It does start to seem like a show stopper :-(
>> >
>> > I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
>> > be added, which would do atomic ops on sp->sda->srcu_lock_count.  Not sure
>> > whether this would be fast enough to be useful, but easy to provide:
>> >
>> > int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
>> > {
>> > 	int idx;
>> >
>> > 	idx = READ_ONCE(sp->srcu_idx) & 0x1;
>> > 	atomic_inc(&sp->sda->srcu_lock_count[idx]);
>> > 	smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
>> > 	return idx;
>> > }
>> >
>> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
>> > {
>> > 	smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
>> > 	atomic_inc(&sp->sda->srcu_unlock_count[idx]);
>> > }
>> >
>> > With appropriate adjustments to allow Tiny RCU to also work.
>> >
>> > Note that you have to use _nmi() everywhere, not just in NMI handlers.
>> > In fact, the NMI handlers are the one place you -don't- need to use
>> > _nmi(), strangely enough.
>> >
>> > Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
>> > some architectures, for example.
>>
>> Continuing Steve's question on regular interrupts, do we need to use
>> this atomic_inc API for regular interrupts as well? So I guess
>
> If NMIs use one srcu_struct and non-NMI uses another, the current
> srcu_read_lock() and srcu_read_unlock() will work just fine.  If any given
> srcu_struct needs both NMI and non-NMI readers, then we really do need
> __srcu_read_lock_nmi() and __srcu_read_unlock_nmi() for that srcu_struct.

Yes, I believe as long as in_nmi() works reliably, we can use the
right srcu_struct (NMI vs non-NMI) and it would be fine.

Going through this thread, it sounds though that this_cpu_inc may not
be reliable on all architectures even for non-NMI interrupts and
local_inc may be the way to go.

For the next merge window (not this one), let's do that then? Paul, if you
could provide me an SRCU API that uses local_inc, then I believe that,
coupled with this patch, should be all that's needed:
https://lore.kernel.org/patchwork/patch/972657/

Steve did express concern though about whether in_nmi() works reliably (i.e.
the tracepoint doesn't fire from "thunk" code before in_nmi() is
available). Any thoughts on that, Steve?

thanks!

- Joel


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Paul E. McKenney
On Wed, Aug 08, 2018 at 12:24:04PM -0400, Steven Rostedt wrote:
> On Wed, 8 Aug 2018 09:02:43 -0700
> "Paul E. McKenney"  wrote:
> 
> > > Which leaves us with sparc, arm, mips, sh and powerpc.
> > > 
> > > sh is almost dead, and powerpc can be fixed, which I guess leaves us
> > > with sparc, arm and mips.  
> > 
> > If we want to stick with the current srcu_read_lock() and srcu_read_unlock(),
> > you mean?  I would like that sort of outcome, at least assuming we are not
> > hammering any of the architectures.
> 
> I would go with the local_inc approach, and even add a
> srcu_read_un/lock_nmi() that does that if you want. Probably should add
> lockdep to detect if the _nmi calls are ever used along with the non-_nmi
> calls and complain about that.

Would it be reasonable to also add a check for non-_nmi calls being
used in both NMI and non-NMI contexts?

> But this will be something for the next merge window, not the one
> coming up.

Completely agreed!

Thanx, Paul



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Steven Rostedt
On Wed, 8 Aug 2018 09:02:43 -0700
"Paul E. McKenney"  wrote:

> > Which leaves us with sparc, arm, mips, sh and powerpc.
> > 
> > sh is almost dead, and powerpc can be fixed, which I guess leaves us
> > with sparc, arm and mips.  
> 
> If we want to stick with the current srcu_read_lock() and srcu_read_unlock(),
> you mean?  I would like that sort of outcome, at least assuming we are not
> hammering any of the architectures.

I would go with the local_inc approach, and even add a
srcu_read_un/lock_nmi() that does that if you want. Probably should add
lockdep to detect if the _nmi calls are ever used along with the non-_nmi
calls and complain about that.
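
Something along those lines might look like this (entirely hypothetical
helper and srcu_struct fields, sketched only for illustration):

	/*
	 * Record which reader flavors this srcu_struct has seen and
	 * complain once if NMI-safe and plain readers are ever mixed;
	 * presumably compiled in only for debug builds.
	 */
	static inline void srcu_check_mixed_readers(struct srcu_struct *sp,
						    bool nmi)
	{
		if (nmi)
			WRITE_ONCE(sp->srcu_saw_nmi_reader, true);
		else
			WRITE_ONCE(sp->srcu_saw_plain_reader, true);
		WARN_ON_ONCE(READ_ONCE(sp->srcu_saw_nmi_reader) &&
			     READ_ONCE(sp->srcu_saw_plain_reader));
	}

called from the _nmi and non-_nmi read-lock paths respectively.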

But this will be something for the next merge window, not the one
coming up.

-- Steve


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Paul E. McKenney
On Wed, Aug 08, 2018 at 11:27:05AM -0400, Steven Rostedt wrote:
> On Wed, 8 Aug 2018 07:42:00 -0700
> "Paul E. McKenney"  wrote:
> 
> > > There's also a local_inc() if you are using per-cpu pointers, that is
> > > supposed to guarantee atomicity for single-cpu operations. That's what
> > > the ftrace ring buffer uses.
> > 
> > Good point, that becomes atomic_long_inc() or equivalent on most
> > architectures, but an incl instruction (not locked) on x86.  So updating
> > my earlier still-untested thought:
> > 
> > int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
> > {
> > 	int idx;
> > 
> > 	idx = READ_ONCE(sp->srcu_idx) & 0x1;
> > 	local_inc(&sp->sda->srcu_lock_count[idx]);
> > 	smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
> > 	return idx;
> > }
> > 
> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> > {
> > 	smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
> > 	local_inc(&sp->sda->srcu_unlock_count[idx]);
> > }
> > 
> > Would that work, or is there a better way to handle this?
> 
> This would work much better than using the atomic ops, and I think it
> would be doable.

OK, here is hoping!

> I may need to revert the srcu for trace_*_rcuidle() for now, as I want
> most of the other changes in this merge window, and it's getting too
> late to do it with these updates.

Agreed, especially since I normally freeze for the next merge window
shortly after -rc5.  ;-)

Thanx, Paul



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Paul E. McKenney
On Wed, Aug 08, 2018 at 11:23:09AM -0400, Steven Rostedt wrote:
> On Wed, 8 Aug 2018 08:05:58 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Wed, Aug 08, 2018 at 10:49:10AM -0400, Steven Rostedt wrote:
> > > On Wed, 8 Aug 2018 07:33:10 -0700
> > > "Paul E. McKenney"  wrote:
> > >   
> > > > On Wed, Aug 08, 2018 at 09:07:24AM -0400, Steven Rostedt wrote:  
> > > > > On Wed, 8 Aug 2018 06:03:02 -0700
> > > > > "Paul E. McKenney"  wrote:
> > > > > 
> > > > > > > What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although
> > > > > > > it won't be atomic for the capture of the idx. But I also don't see
> > > > > > > interrupts being disabled, thus an NMI is no different than any
> > > > > > > interrupt doing the same thing, right?
> > > > > > 
> > > > > > On architectures without increment-memory instructions, if you take an NMI
> > > > > > between the load from sp->sda->srcu_lock_count and the later store, you
> > > > > > lose a count.  Note that both __srcu_read_lock() and __srcu_read_unlock()
> > > > > > do increments of different locations, so you cannot rely on the usual
> > > > > > "NMI fixes up before exit" semantics you get when incrementing and
> > > > > > decrementing the same location.
> > > > > 
> > > > > And how is this handled in the interrupt case? Interrupts are not
> > > > > disabled here.
> > > > 
> > > > Actually, on most architectures interrupts are in fact disabled:
> > > > 
> > > > #define this_cpu_generic_to_op(pcp, val, op)			\
> > > > do {								\
> > > > 	unsigned long __flags;					\
> > > > 	raw_local_irq_save(__flags);				\
> > > > 	raw_cpu_generic_to_op(pcp, val, op);			\
> > > > 	raw_local_irq_restore(__flags);				\
> > > > } while (0)
> > > > 
> > > > NMIs, not so much.
> > > 
> > > And do these archs have NMIs?  
> > 
> > It would appear so:
> 
> Well the next question is, which of these archs that use it are in this
> list.
> 
> > $ find . -name 'Kconfig*' -exec grep -l 'select HAVE_NMI\>' {} \;
> > ./arch/sparc/Kconfig
> > ./arch/s390/Kconfig
> > ./arch/arm/Kconfig
> > ./arch/arm64/Kconfig
> > ./arch/mips/Kconfig
> > ./arch/sh/Kconfig
> > ./arch/powerpc/Kconfig
> 
> Note, I know that powerpc "imitates" an NMI. It just sets the NMI as a
> priority higher than other interrupts.

Plus as you say below, its local_inc() is atomic, and thus NMI-safe,
and thus the _nmi() approach would work.

> > ./arch/x86/Kconfig
> 
> And we get this:
> 
> $ git grep this_cpu_add_4
> arch/arm64/include/asm/percpu.h:#define this_cpu_add_4(pcp, val) _percpu_add(pcp, val)
> arch/s390/include/asm/percpu.h:#define this_cpu_add_4(pcp, val) arch_this_cpu_to_op_simple(pcp, val, +)
> arch/s390/include/asm/percpu.h:#define this_cpu_add_4(pcp, val) arch_this_cpu_add(pcp, val, "laa", "asi", int)
> arch/x86/include/asm/percpu.h:#define this_cpu_add_4(pcp, val) percpu_add_op((pcp), val)
> 
> Which leaves us with sparc, arm, mips, sh and powerpc.
> 
> sh is almost dead, and powerpc can be fixed, which I guess leaves us
> with sparc, arm and mips.

If we want to stick with the current srcu_read_lock() and srcu_read_unlock(),
you mean?  I would like that sort of outcome, at least assuming we are not
hammering any of the architectures.

Thanx, Paul
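
To make the lost-count scenario Paul describes above concrete, here is
the problematic interleaving on an architecture whose this_cpu_inc() is
a separate load and store (counter names from the SRCU code quoted above):

	/*
	 * srcu_lock_count[idx] is 5; a plain read-modify-write increment:
	 *
	 *   task:  load  srcu_lock_count[idx]  -> 5
	 *   NMI:   load  srcu_lock_count[idx]  -> 5
	 *   NMI:   store srcu_lock_count[idx]  <- 6
	 *   task:  store srcu_lock_count[idx]  <- 6   (NMI's count lost)
	 *
	 * Because srcu_read_unlock() increments the separate
	 * srcu_unlock_count[idx], the NMI's nested lock/unlock cannot
	 * cancel out the way paired increments and decrements of a
	 * single location would.
	 */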



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Steven Rostedt
On Wed, 8 Aug 2018 07:42:00 -0700
"Paul E. McKenney"  wrote:

> > There's also a local_inc() if you are using per-cpu pointers, that is
> > supposed to guarantee atomicity for single-cpu operations. That's what
> > the ftrace ring buffer uses.
> 
> Good point, that becomes atomic_long_inc() or equivalent on most
> architectures, but an incl instruction (not locked) on x86.  So updating
> my earlier still-untested thought:
> 
> int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
> {
> 	int idx;
> 
> 	idx = READ_ONCE(sp->srcu_idx) & 0x1;
> 	local_inc(&sp->sda->srcu_lock_count[idx]);
> 	smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
> 	return idx;
> }
> 
> void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> {
> 	smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
> 	local_inc(&sp->sda->srcu_unlock_count[idx]);
> }
> 
> Would that work, or is there a better way to handle this?

This would work much better than using the atomic ops, and I think it
would be doable.

I may need to revert the srcu for trace_*_rcuidle() for now, as I want
most of the other changes in this merge window, and it's getting too
late to do it with these updates.

-- Steve
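
One practical note on the draft above: local_inc() operates on local_t,
so the srcu_lock_count/srcu_unlock_count arrays would need a type change
as part of any real patch. A caller would pair the proposed primitives
like this (hypothetical usage; these functions are Paul's untested draft,
not an existing kernel API):

	int idx;

	idx = __srcu_read_lock_nmi(sp);
	/* ... read-side critical section, safe even from NMI ... */
	__srcu_read_unlock_nmi(sp, idx);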



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Steven Rostedt
On Wed, 8 Aug 2018 08:05:58 -0700
"Paul E. McKenney"  wrote:

> On Wed, Aug 08, 2018 at 10:49:10AM -0400, Steven Rostedt wrote:
> > On Wed, 8 Aug 2018 07:33:10 -0700
> > "Paul E. McKenney"  wrote:
> >   
> > > On Wed, Aug 08, 2018 at 09:07:24AM -0400, Steven Rostedt wrote:  
> > > > On Wed, 8 Aug 2018 06:03:02 -0700
> > > > "Paul E. McKenney"  wrote:
> > > > 
> > > > > > What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although
> > > > > > it won't be atomic for the capture of the idx. But I also don't see
> > > > > > interrupts being disabled, thus an NMI is no different than any
> > > > > > interrupt doing the same thing, right?  
> > > > > 
> > > > > On architectures without increment-memory instructions, if you take an NMI
> > > > > between the load from sp->sda->srcu_lock_count and the later store, you
> > > > > lose a count.  Note that both __srcu_read_lock() and __srcu_read_unlock()
> > > > > do increments of different locations, so you cannot rely on the usual
> > > > > "NMI fixes up before exit" semantics you get when incrementing and
> > > > > decrementing the same location.
> > > > 
> > > > And how is this handled in the interrupt case? Interrupts are not
> > > > disabled here.
> > > 
> > > Actually, on most architectures interrupts are in fact disabled:
> > > 
> > > #define this_cpu_generic_to_op(pcp, val, op)            \
> > > do {                                                    \
> > >         unsigned long __flags;                          \
> > >         raw_local_irq_save(__flags);                    \
> > >         raw_cpu_generic_to_op(pcp, val, op);            \
> > >         raw_local_irq_restore(__flags);                 \
> > > } while (0)
> > > 
> > > NMIs, not so much.  
> > 
> > And do these archs have NMIs?  
> 
> It would appear so:

Well, the next question is which of these archs that use it are in this
list.

> 
> $ find . -name 'Kconfig*' -exec grep -l 'select HAVE_NMI\>' {} \;
> ./arch/sparc/Kconfig
> ./arch/s390/Kconfig
> ./arch/arm/Kconfig
> ./arch/arm64/Kconfig
> ./arch/mips/Kconfig
> ./arch/sh/Kconfig
> ./arch/powerpc/Kconfig

Note, I know that powerpc "imitates" an NMI. It just gives the NMI a
higher priority than other interrupts.

> ./arch/x86/Kconfig
> 

And we get this:

$ git grep this_cpu_add_4
arch/arm64/include/asm/percpu.h:#define this_cpu_add_4(pcp, val) _percpu_add(pcp, val)
arch/s390/include/asm/percpu.h:#define this_cpu_add_4(pcp, val) arch_this_cpu_to_op_simple(pcp, val, +)
arch/s390/include/asm/percpu.h:#define this_cpu_add_4(pcp, val) arch_this_cpu_add(pcp, val, "laa", "asi", int)
arch/x86/include/asm/percpu.h:#define this_cpu_add_4(pcp, val)  percpu_add_op((pcp), val)

Which leaves us with sparc, arm, mips, sh and powerpc.

sh is almost dead, and powerpc can be fixed, which I guess leaves us
with sparc, arm and mips.

-- Steve


>   Thanx, Paul



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Paul E. McKenney
On Wed, Aug 08, 2018 at 10:49:10AM -0400, Steven Rostedt wrote:
> On Wed, 8 Aug 2018 07:33:10 -0700
> "Paul E. McKenney"  wrote:
> 
> > On Wed, Aug 08, 2018 at 09:07:24AM -0400, Steven Rostedt wrote:
> > > On Wed, 8 Aug 2018 06:03:02 -0700
> > > "Paul E. McKenney"  wrote:
> > >   
> > > > > What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although
> > > > > it won't be atomic for the capture of the idx. But I also don't see
> > > > > interrupts being disabled, thus an NMI is no different than any
> > > > > interrupt doing the same thing, right?
> > > > 
> > > > On architectures without increment-memory instructions, if you take an NMI
> > > > between the load from sp->sda->srcu_lock_count and the later store, you
> > > > lose a count.  Note that both __srcu_read_lock() and __srcu_read_unlock()
> > > > do increments of different locations, so you cannot rely on the usual
> > > > "NMI fixes up before exit" semantics you get when incrementing and
> > > > decrementing the same location.  
> > > 
> > > And how is this handled in the interrupt case? Interrupts are not
> > > disabled here.  
> > 
> > Actually, on most architectures interrupts are in fact disabled:
> > 
> > #define this_cpu_generic_to_op(pcp, val, op)             \
> > do {                                                     \
> >         unsigned long __flags;                           \
> >         raw_local_irq_save(__flags);                     \
> >         raw_cpu_generic_to_op(pcp, val, op);             \
> >         raw_local_irq_restore(__flags);                  \
> > } while (0)
> > 
> > NMIs, not so much.
> 
> And do these archs have NMIs?

It would appear so:

$ find . -name 'Kconfig*' -exec grep -l 'select HAVE_NMI\>' {} \;
./arch/sparc/Kconfig
./arch/s390/Kconfig
./arch/arm/Kconfig
./arch/arm64/Kconfig
./arch/mips/Kconfig
./arch/sh/Kconfig
./arch/powerpc/Kconfig
./arch/x86/Kconfig

Thanx, Paul



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Paul E. McKenney
On Wed, Aug 08, 2018 at 07:10:53AM -0700, Joel Fernandes wrote:
> On Wed, Aug 8, 2018 at 6:00 AM, Paul E. McKenney
>  wrote:
> > On Tue, Aug 07, 2018 at 08:53:54PM -0700, Joel Fernandes wrote:
> >> On Tue, Aug 7, 2018 at 8:44 PM, Joel Fernandes  wrote:
> >> > Hi Steve,
> >> >
> >> > On Tue, Aug 7, 2018 at 7:28 PM, Steven Rostedt  
> >> > wrote:
> >> [...]
> >> >>> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void);
> >> >>>   } while ((++it_func_ptr)->func);\
> >> >>>   }   \
> >> >>>   \
> >> >>> - if (rcuidle)\
> >> >>> - srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
> >> >>> + srcu_read_unlock_notrace(ss, idx);  \
> >> >>
> >> >> Hmm, why do we have the two different srcu handles?
> >> >
> >> > Because if the memory operations happening on the normal SRCU handle
> >> > (during srcu_read_lock) are interrupted by an NMI, then the other handle
> >> > (devoted to NMI) could be used instead and not bother the interrupted
> >> > handle. Does that make sense?
> >> >
> >> > When I talked to Paul a few months ago about SRCU from NMI context, he
> >> > mentioned the per-cpu memory operations during srcu_read_lock can be
> >> > NMI interrupted, that's why we added that warning.
> >>
> >> So I looked more closely, __srcu_read_lock on 2 different handles may
> >> still be doing a this_cpu_inc on the same location..
> >> (sp->sda->srcu_lock_count). :-(
> >>
> >> Paul any ideas on how to solve this?
> >
> > You lost me on this one.  When you said "2 different handles", I assumed
> > that you meant two different values of "sp", which would have two
> > different addresses for &sp->sda->srcu_lock_count.  What am I missing?
> 
> Thanks a lot for the reply.
> I thought "sda" is the same for different srcu_struct(s). Maybe it
> was too late for me in the night; that's why I thought so. Which makes
> no sense now that I think of it.

I know that feeling!  ;-)

> In that case, based on what you're saying, the patch I sent using a
> different srcu_struct for NMI is still good, I guess...

As long as you wait for both SRCU grace periods.  Hmmm...  Maybe that means
that there is still a use for synchronize_rcu_mult():

void call_srcu_nmi(struct rcu_head *rhp, rcu_callback_t func)
{
call_srcu(&my_srcu_struct_nmi, rhp, func);
}

void call_srcu_nonmi(struct rcu_head *rhp, rcu_callback_t func)
{
call_srcu(&my_srcu_struct_nonmi, rhp, func);
}

...

/* Wait concurrently on the two grace periods. */
synchronize_rcu_mult(call_srcu_nmi, call_srcu_nonmi);

On the other hand, I bet that doing this is just fine in your use case:

synchronize_srcu(&my_srcu_struct_nmi);
synchronize_srcu(&my_srcu_struct_nonmi);

But please note that synchronize_rcu_mult() is no longer in my -rcu tree,
so if you do want it please let me know (and please let me know why it
is important).
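
For concreteness, a hedged sketch of an update-side path under the
two-srcu_struct scheme, waiting for both domains before freeing; the
tracepoint_srcu_nmi name is made up here (only tracepoint_srcu exists in
the patch), so treat this as illustrative:

/* Sketch only: free an old probe array after BOTH reader domains have
 * quiesced, since a reader may have entered either srcu_struct. */
static void release_probes_sketch(struct tracepoint_func *old)
{
        if (!old)
                return;
        synchronize_srcu(&tracepoint_srcu_nmi); /* hypothetical NMI domain */
        synchronize_srcu(&tracepoint_srcu);     /* non-NMI domain */
        kfree(old);
}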

> >> It does start to seem like a show stopper :-(
> >
> > I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
> > be added, which would do atomic ops on sp->sda->srcu_lock_count.  Not sure
> > whether this would be fast enough to be useful, but easy to provide:
> >
> > int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
> > {
> > int idx;
> >
> > idx = READ_ONCE(sp->srcu_idx) & 0x1;
> > atomic_inc(&sp->sda->srcu_lock_count[idx]);
> > smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
> > return idx;
> > }
> >
> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> > {
> > smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
> > atomic_inc(&sp->sda->srcu_unlock_count[idx]);
> > }
> >
> > With appropriate adjustments to allow Tiny RCU to also work.
> >
> > Note that you have to use _nmi() everywhere, not just in NMI handlers.
> > In fact, the NMI handlers are the one place you -don't- need to use
> > _nmi(), strangely enough.
> >
> > Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
> > some architectures, for example.
> 
> Continuing Steve's question on regular interrupts, do we need to use
> this atomic_inc API for regular interrupts as well?

If NMIs use one srcu_struct and non-NMI uses another, the current
srcu_read_lock() and srcu_read_unlock() will work just fine.  If any given
srcu_struct needs both NMI and non-NMI readers, then we really do need
__srcu_read_lock_nmi() and __srcu_read_unlock_nmi() for that srcu_struct.

Thanx, Paul



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Steven Rostedt
On Wed, 8 Aug 2018 07:33:10 -0700
"Paul E. McKenney"  wrote:

> On Wed, Aug 08, 2018 at 09:07:24AM -0400, Steven Rostedt wrote:
> > On Wed, 8 Aug 2018 06:03:02 -0700
> > "Paul E. McKenney"  wrote:
> >   
> > > > What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although
> > > > it won't be atomic for the capture of the idx. But I also don't see
> > > > interrupts being disabled, thus an NMI is no different than any
> > > > interrupt doing the same thing, right?
> > > 
> > > On architectures without increment-memory instructions, if you take an NMI
> > > between the load from sp->sda->srcu_lock_count and the later store, you
> > > lose a count.  Note that both __srcu_read_lock() and __srcu_read_unlock()
> > > do increments of different locations, so you cannot rely on the usual
> > > "NMI fixes up before exit" semantics you get when incrementing and
> > > decrementing the same location.  
> > 
> > And how is this handled in the interrupt case? Interrupts are not
> > disabled here.  
> 
> Actually, on most architectures interrupts are in fact disabled:
> 
> #define this_cpu_generic_to_op(pcp, val, op)  \
> do {  \
>   unsigned long __flags;  \
>   raw_local_irq_save(__flags);\
>   raw_cpu_generic_to_op(pcp, val, op);\
>   raw_local_irq_restore(__flags); \
> } while (0)
> 
> NMIs, not so much.

And do these archs have NMIs?

-- Steve

> 
> > I would also argue that architectures without increment-memory
> > instructions shouldn't have NMIs ;-)  
> 
> I would also argue a lot of things, but objective reality does not take my
> opinions into account all that often.  Which might be a good thing.  ;-)
> 
>   Thanx, Paul



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Paul E. McKenney
On Wed, Aug 08, 2018 at 10:27:00AM -0400, Steven Rostedt wrote:
> On Wed, 8 Aug 2018 06:00:41 -0700
> "Paul E. McKenney"  wrote:
> > 
> > I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
> > be added, which would do atomic ops on sp->sda->srcu_lock_count.  Not sure
> > whether this would be fast enough to be useful, but easy to provide:
> > 
> > int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
> > {
> > int idx;
> > 
> > idx = READ_ONCE(sp->srcu_idx) & 0x1;
> > atomic_inc(&sp->sda->srcu_lock_count[idx]);
> > smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
> > return idx;
> > }
> > 
> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> > {
> > smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
> > atomic_inc(&sp->sda->srcu_unlock_count[idx]);
> > }
> > 
> > With appropriate adjustments to allow Tiny RCU to also work.
> > 
> > Note that you have to use _nmi() everywhere, not just in NMI handlers.
> > In fact, the NMI handlers are the one place you -don't- need to use
> > _nmi(), strangely enough.
> > 
> > Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
> > some architectures, for example.
> 
> Note this would kill the performance benefit of srcu that Joel wants.
> Switching from a this_cpu_inc() to an atomic_inc() would have a huge
> impact.

I don't know how huge it would be, but that concern is exactly why I am
proposing adding _nmi() interfaces rather than just directly changing
the stock __srcu_read_lock() and __srcu_read_unlock() functions.

> There's also a local_inc() if you are using per-cpu pointers, that is
> supposed to guarantee atomicity for single-CPU operations. That's what
> the ftrace ring buffer uses.

Good point, that becomes atomic_long_inc() or equivalent on most
architectures, but an incl instruction (not locked) on x86.  So updating
my earlier still-untested thought:

int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
{
int idx;

idx = READ_ONCE(sp->srcu_idx) & 0x1;
local_inc(&sp->sda->srcu_lock_count[idx]);
smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
return idx;
}

void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
{
smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
local_inc(&sp->sda->srcu_unlock_count[idx]);
}

Would that work, or is there a better way to handle this?

Thanx, Paul
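
One practical note on the local_inc() version above: local_inc() operates
on local_t rather than unsigned long, so the srcu_data counters and the
grace-period scan would need matching changes.  A hedged sketch, with an
illustrative struct that is not the real srcu_data:

#include <asm/local.h>

struct srcu_data_sketch {
        local_t srcu_lock_count[2];     /* was: unsigned long [2] */
        local_t srcu_unlock_count[2];   /* was: unsigned long [2] */
};

/* The grace-period side would sum with local_read() instead of plain
 * loads of unsigned long. */
static unsigned long sketch_readers_lock_idx(struct srcu_data_sketch __percpu *sda,
                                             int idx)
{
        unsigned long sum = 0;
        int cpu;

        for_each_possible_cpu(cpu)
                sum += (unsigned long)local_read(
                                &per_cpu_ptr(sda, cpu)->srcu_lock_count[idx]);
        return sum;
}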



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Paul E. McKenney
On Wed, Aug 08, 2018 at 09:07:24AM -0400, Steven Rostedt wrote:
> On Wed, 8 Aug 2018 06:03:02 -0700
> "Paul E. McKenney"  wrote:
> 
> > > What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although
> > > it won't be atomic for the capture of the idx. But I also don't see
> > > interrupts being disabled, thus an NMI is no different than any
> > > interrupt doing the same thing, right?  
> > 
> > On architectures without increment-memory instructions, if you take an NMI
> > between the load from sp->sda->srcu_lock_count and the later store, you
> > lose a count.  Note that both __srcu_read_lock() and __srcu_read_unlock()
> > do increments of different locations, so you cannot rely on the usual
> > "NMI fixes up before exit" semantics you get when incrementing and
> > decrementing the same location.
> 
> And how is this handled in the interrupt case? Interrupts are not
> disabled here.

Actually, on most architectures interrupts are in fact disabled:

#define this_cpu_generic_to_op(pcp, val, op)            \
do {                                                    \
        unsigned long __flags;                          \
        raw_local_irq_save(__flags);                    \
        raw_cpu_generic_to_op(pcp, val, op);            \
        raw_local_irq_restore(__flags);                 \
} while (0)

NMIs, not so much.

> I would also argue that architectures without increment-memory
> instructions shouldn't have NMIs ;-)

I would also argue a lot of things, but objective reality does not take my
opinions into account all that often.  Which might be a good thing.  ;-)

Thanx, Paul



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Steven Rostedt
On Wed, 8 Aug 2018 06:00:41 -0700
"Paul E. McKenney"  wrote:
> 
> I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
> be added, which would do atomic ops on sp->sda->srcu_lock_count.  Not sure
> whether this would be fast enough to be useful, but easy to provide:
> 
> int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
> {
>   int idx;
> 
>   idx = READ_ONCE(sp->srcu_idx) & 0x1;
>   atomic_inc(&sp->sda->srcu_lock_count[idx]);
>   smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
>   return idx;
> }
> 
> void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> {
>   smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
>   atomic_inc(&sp->sda->srcu_unlock_count[idx]);
> }
> 
> With appropriate adjustments to allow Tiny RCU to also work.
> 
> Note that you have to use _nmi() everywhere, not just in NMI handlers.
> In fact, the NMI handlers are the one place you -don't- need to use
> _nmi(), strangely enough.
> 
> Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
> some architectures, for example.

Note this would kill the performance benefit of srcu that Joel wants.
Switching from a this_cpu_inc() to an atomic_inc() would have a huge
impact.
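
A rough, x86-centric illustration of the cost gap being worried about;
the instruction forms below are meant as intuition, not as measured data:

/*
 *      this_cpu_inc(cnt);      // incq %gs:cnt      -- no LOCK prefix
 *      local_inc(&lcnt);       // incq (%rax)       -- no LOCK prefix
 *      atomic_inc(&acnt);      // lock incl (%rax)  -- locked RMW
 *
 * The LOCK prefix forces a serializing read-modify-write on the cache
 * line, which is what would hurt a path hit by every tracepoint.
 */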

There's also a local_inc() if you are using per-cpu pointers, that is
supposed to guarantee atomicity for single-CPU operations. That's what
the ftrace ring buffer uses.

-- Steve


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Joel Fernandes
On Wed, Aug 8, 2018 at 6:00 AM, Paul E. McKenney
 wrote:
> On Tue, Aug 07, 2018 at 08:53:54PM -0700, Joel Fernandes wrote:
>> On Tue, Aug 7, 2018 at 8:44 PM, Joel Fernandes  wrote:
>> > Hi Steve,
>> >
>> > On Tue, Aug 7, 2018 at 7:28 PM, Steven Rostedt  wrote:
>> [...]
>> >>> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void);
>> >>>   } while ((++it_func_ptr)->func);\
>> >>>   }   \
>> >>>   \
>> >>> - if (rcuidle)\
>> >>> - srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
>> >>> + srcu_read_unlock_notrace(ss, idx);  \
>> >>
>> >> Hmm, why do we have the two different srcu handles?
>> >
>> > Because if the memory operations happening on the normal SRCU handle
>> > (during srcu_read_lock) are interrupted by an NMI, then the other handle
>> > (devoted to NMI) could be used instead and not bother the interrupted
>> > handle. Does that make sense?
>> >
>> > When I talked to Paul a few months ago about SRCU from NMI context, he
>> > mentioned the per-cpu memory operations during srcu_read_lock can be
>> > NMI interrupted, that's why we added that warning.
>>
>> So I looked more closely, __srcu_read_lock on 2 different handles may
>> still be doing a this_cpu_inc on the same location..
>> (sp->sda->srcu_lock_count). :-(
>>
>> Paul any ideas on how to solve this?
>
> You lost me on this one.  When you said "2 different handles", I assumed
> that you meant two different values of "sp", which would have two
> different addresses for &sp->sda->srcu_lock_count.  What am I missing?

Thanks a lot for the reply.
I thought "sda" is the same for different srcu_struct(s). Maybe it
was too late for me in the night; that's why I thought so. Which makes
no sense now that I think of it.

In that case, based on what you're saying, the patch I sent using a
different srcu_struct for NMI is still good, I guess...

>> It does start to seem like a show stopper :-(
>
> I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
> be added, which would do atomic ops on sp->sda->srcu_lock_count.  Not sure
> whether this would be fast enough to be useful, but easy to provide:
>
> int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
> {
> int idx;
>
> idx = READ_ONCE(sp->srcu_idx) & 0x1;
> atomic_inc(&sp->sda->srcu_lock_count[idx]);
> smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
> return idx;
> }
>
> void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> {
> smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
> atomic_inc(&sp->sda->srcu_unlock_count[idx]);
> }
>
> With appropriate adjustments to allow Tiny RCU to also work.
>
> Note that you have to use _nmi() everywhere, not just in NMI handlers.
> In fact, the NMI handlers are the one place you -don't- need to use
> _nmi(), strangely enough.
>
> Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
> some architectures, for example.

Continuing Steve's question on regular interrupts, do we need to use
this atomic_inc API for regular interrupts as well?

Thanks!


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Steven Rostedt
On Wed, 8 Aug 2018 06:03:02 -0700
"Paul E. McKenney"  wrote:

> > What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although
> > it won't be atomic for the capture of the idx. But I also don't see
> > interrupts being disabled, thus an NMI is no different than any
> > interrupt doing the same thing, right?  
> 
> On architectures without increment-memory instructions, if you take an NMI
> between the load from sp->sda->srcu_lock_count and the later store, you
> lose a count.  Note that both __srcu_read_lock() and __srcu_read_unlock()
> do increments of different locations, so you cannot rely on the usual
> "NMI fixes up before exit" semantics you get when incrementing and
> decrementing the same location.

And how is this handled in the interrupt case? Interrupts are not
disabled here.

I would also argue that architectures without increment-memory
instructions shouldn't have NMIs ;-)

-- Steve


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Paul E. McKenney
On Wed, Aug 08, 2018 at 08:46:29AM -0400, Steven Rostedt wrote:
> On Tue, 7 Aug 2018 20:53:54 -0700
> Joel Fernandes  wrote:
> 
> 
> > > When I talked to Paul a few months ago about SRCU from NMI context, he
> > > mentioned the per-cpu memory operations during srcu_read_lock can be
> > > NMI interrupted, that's why we added that warning.  
> > 
> > So I looked more closely, __srcu_read_lock on 2 different handles may
> > still be doing a this_cpu_inc on the same location..
> > (sp->sda->srcu_lock_count). :-(
> > 
> > Paul any ideas on how to solve this?
> > 
> > It does start to seem like a show stopper :-(
> 
> What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although
> it won't be atomic for the capture of the idx. But I also don't see
> interrupts being disabled, thus an NMI is no different than any
> interrupt doing the same thing, right?

On architectures without increment-memory instructions, if you take an NMI
between the load from sp->sda->srcu_lock_count and the later store, you
lose a count.  Note that both __srcu_read_lock() and __srcu_read_unlock()
do increments of different locations, so you cannot rely on the usual
"NMI fixes up before exit" semantics you get when incrementing and
decrementing the same location.
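
Spelling out the interleaving Paul describes, assuming an architecture
where the increment compiles to a separate load and store (illustrative
pseudocode in comment form):

/*
 * task:  r = srcu_lock_count[idx];        // loads 5
 * NMI:   srcu_lock_count[idx]++;          // 5 -> 6, reader enters
 *        ... NMI-time read-side section ...
 *        srcu_unlock_count[idx]++;        // unlock recorded
 * task:  srcu_lock_count[idx] = r + 1;    // stores 6: NMI's lock lost
 *
 * The unlock survived but its matching lock did not, so the counters
 * can show more unlocks than locks, and a grace period can end while a
 * reader is still active.
 */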

Thanx, Paul



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Paul E. McKenney
On Tue, Aug 07, 2018 at 08:53:54PM -0700, Joel Fernandes wrote:
> On Tue, Aug 7, 2018 at 8:44 PM, Joel Fernandes  wrote:
> > Hi Steve,
> >
> > On Tue, Aug 7, 2018 at 7:28 PM, Steven Rostedt  wrote:
> [...]
> >>> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void);
> >>>   } while ((++it_func_ptr)->func);\
> >>>   }   \
> >>>   \
> >>> - if (rcuidle)\
> >>> - srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
> >>> + srcu_read_unlock_notrace(ss, idx);  \
> >>
> >> Hmm, why do we have the two different srcu handles?
> >
> > Because if the memory operations happening on the normal SRCU handle
> > (during srcu_read_lock) are interrupted by an NMI, then the other handle
> > (devoted to NMI) could be used instead and not bother the interrupted
> > handle. Does that make sense?
> >
> > When I talked to Paul a few months ago about SRCU from NMI context, he
> > mentioned the per-cpu memory operations during srcu_read_lock can be
> > NMI interrupted, that's why we added that warning.
> 
> So I looked more closely, __srcu_read_lock on 2 different handles may
> still be doing a this_cpu_inc on the same location..
> (sp->sda->srcu_lock_count). :-(
> 
> Paul any ideas on how to solve this?

You lost me on this one.  When you said "2 different handles", I assumed
that you meant two different values of "sp", which would have two
different addresses for &sp->sda->srcu_lock_count.  What am I missing?

> It does start to seem like a show stopper :-(

I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
be added, which would do atomic ops on sp->sda->srcu_lock_count.  Not sure
whether this would be fast enough to be useful, but easy to provide:

int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
{
int idx;

idx = READ_ONCE(sp->srcu_idx) & 0x1;
atomic_inc(&sp->sda->srcu_lock_count[idx]);
smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
return idx;
}

void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
{
smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
atomic_inc(&sp->sda->srcu_unlock_count[idx]);
}

With appropriate adjustments to allow Tiny RCU to also work.

Note that you have to use _nmi() everywhere, not just in NMI handlers.
In fact, the NMI handlers are the one place you -don't- need to use
_nmi(), strangely enough.

Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
some architectures, for example.

Thanx, Paul



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-08 Thread Steven Rostedt
On Tue, 7 Aug 2018 20:53:54 -0700
Joel Fernandes  wrote:


> > When I talked to Paul a few months ago about SRCU from NMI context, he
> > mentioned the per-cpu memory operations during srcu_read_lock can be
> > NMI interrupted, that's why we added that warning.  
> 
> So I looked more closely, __srcu_read_lock on 2 different handles may
> still be doing a this_cpu_inc on the same location..
> (sp->sda->srcu_lock_count). :-(
> 
> Paul any ideas on how to solve this?
> 
> It does start to seem like a show stopper :-(

What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although
it won't be atomic for the capture of the idx. But I also don't see
interrupts being disabled, thus an NMI is no different than any
interrupt doing the same thing, right?

-- Steve


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Joel Fernandes
On Tue, Aug 7, 2018 at 8:53 PM, Joel Fernandes  wrote:
> On Tue, Aug 7, 2018 at 8:44 PM, Joel Fernandes  wrote:
>> Hi Steve,
>>
>> On Tue, Aug 7, 2018 at 7:28 PM, Steven Rostedt  wrote:
> [...]
>>>> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void);
>>>>   } while ((++it_func_ptr)->func);\
>>>>   }   \
>>>>   \
>>>> - if (rcuidle)\
>>>> - srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
>>>> + srcu_read_unlock_notrace(ss, idx);  \
>>>
>>> Hmm, why do we have the two different srcu handles?
>>
>> Because if the memory operations happening on the normal SRCU handle
>> (during srcu_read_lock) are interrupted by an NMI, then the other handle
>> (devoted to NMI) could be used instead and not bother the interrupted
>> handle. Does that make sense?
>>
>> When I talked to Paul a few months ago about SRCU from NMI context, he
>> mentioned the per-cpu memory operations during srcu_read_lock can be
>> NMI interrupted, that's why we added that warning.
>
> So I looked more closely, __srcu_read_lock on 2 different handles may
> still be doing a this_cpu_inc on the same location..
> (sp->sda->srcu_lock_count). :-(
>
> Paul any ideas on how to solve this?
>
> It does start to seem like a show stopper :-(

Steve, another thing we could do is drop "tracepoint: Make rcuidle
tracepoint callers use SRCU" and just call into irqsoff and preemptoff
tracer hooks directly.

The reason I added "tracepoint: Make rcuidle tracepoint callers use
SRCU" is that lockdep's performance dropped with the existing
tracepoint code and SRCU improved that. But now that you're calling
directly into lockdep, that shouldn't be an issue.

That should resolve your NMI issues and keep my macro and other
cleanups from the original patch. What do you think?

I am out in the morning, but I could write this up in the evening when
I get some time (unless you do it before me).

thanks,

 - Joel


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Joel Fernandes
On Tue, Aug 7, 2018 at 8:44 PM, Joel Fernandes  wrote:
> Hi Steve,
>
> On Tue, Aug 7, 2018 at 7:28 PM, Steven Rostedt  wrote:
[...]
>>> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void);
>>>   } while ((++it_func_ptr)->func);\
>>>   }   \
>>>   \
>>> - if (rcuidle)\
>>> - srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
>>> + srcu_read_unlock_notrace(ss, idx);  \
>>
>> Hmm, why do we have the two different srcu handles?
>
> Because if the memory operations happening on the normal SRCU handle
> (during srcu_read_lock) are interrupted by an NMI, then the other handle
> (devoted to NMI) could be used instead and not bother the interrupted
> handle. Does that make sense?
>
> When I talked to Paul a few months ago about SRCU from NMI context, he
> mentioned the per-cpu memory operations during srcu_read_lock can be
> NMI interrupted, that's why we added that warning.

So I looked more closely, __srcu_read_lock on 2 different handles may
still be doing a this_cpu_inc on the same location..
(sp->sda->srcu_lock_count). :-(

Paul any ideas on how to solve this?

It does start to seem like a show stopper :-(


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Joel Fernandes
Hi Steve,

On Tue, Aug 7, 2018 at 7:28 PM, Steven Rostedt  wrote:
> On Tue, 7 Aug 2018 19:13:32 -0700
> Joel Fernandes  wrote:
>> >
>> >> From 6986af946ceb04fc9ddc6d5b45fc559b6807e465 Mon Sep 17 00:00:00 2001
>> >> From: "Joel Fernandes (Google)" 
>> >> Date: Sun, 5 Aug 2018 20:17:41 -0700
>> >> Subject: [PATCH] tracepoint: Run tracepoints even after CPU is offline
>> >>
>> >> Commit f37755490fe9 ("tracepoints: Do not trace when cpu is offline")
>> >> causes a problem for lockdep using tracepoint code. Once a CPU is
>> >> offline, tracepoints do not get called; however, this causes a big problem
>> >> for lockdep probes that need to run so that IRQ annotations are marked
>> >> correctly.
>> >>
>> >> A race is possible where while the CPU is going offline, an interrupt
>> >> can come in and then a lockdep assert causes an annotation warning:
>> >>
>> >> [  106.551354] IRQs not enabled as expected
>> >> [  106.551785] WARNING: CPU: 1 PID: 0 at kernel/time/tick-sched.c:982
>> >>  tick_nohz_idle_enter+0x99/0xb0
>> >> [  106.552964] Modules linked in:
>> >> [  106.553299] CPU: 1 PID: 0 Comm: swapper/1 Tainted: GW
>> >>
>> >> We need tracepoints to run as late as possible. This commit fixes the
>> >> issue by removing the cpu_online check in tracepoint code that was
>> >> introduced in the mentioned commit, however we now switch to using SRCU
>> >> for all tracepoints and specially handle calling tracepoints from NMI so
>> >> that we don't run into issues that result from using sched-RCU when the
>> >> CPUs are marked to be offline.
>> >>
>> >> Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and
>> >>   unify their usage")
>> >> Reported-by: Masami Hiramatsu 
>> >> Signed-off-by: Joel Fernandes (Google) 
>> >
>> >
>> > The above change log doesn't look like it matches the NMI patch.
>> >
>> > Can you resend with just the NMI changes? I already handled the cpu
>> > offline ones.
>>
>> Ok, sent with "cpu offline" changes dropped, here it is:
>> https://lore.kernel.org/patchwork/patch/972657/
>>
>> If you could add your Reported-by to it, that would be great as well.
>>
>> >
>> > But I may still have concerns with this patch.
>>
>> Ok let me know what you think.
>>
>
> Not sure you saw this part of my reply:

I missed the part on the SRCU handles, sorry.

>
>> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void);
>>   } while ((++it_func_ptr)->func);\
>>   }   \
>>   \
>> - if (rcuidle)\
>> - srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
>> + srcu_read_unlock_notrace(ss, idx);  \
>
> Hmm, why do we have the two different SRCU handles?

Because if the memory operations happening on the normal SRCU handle
(during srcu_read_lock) are interrupted by NMI, then the other handle
(devoted to NMI) could be used instead and not bother the interrupted
handle. Does that make sense?

When I talked to Paul a few months ago about SRCU from NMI context, he
mentioned the per-cpu memory operations during srcu_read_lock can be
NMI interrupted, that's why we added that warning.

> Thinking about this, if this can be called by the "thunk" code, then
> there might be an issue. I think the "thunk" code can be called before
> in_nmi() is set, so we don't know if we are in an NMI or not. I need to
> look at that code to make sure. If in_nmi() still works, then we should
> use the _nmi SRCU handle (if that's required).
>
> But I'm not sure using SRCU for all tracepoints is needed at this
> moment. I'll have to look deeper into this tomorrow. But it's getting
> close to the merge window, and this needs to be settled quickly. Another
> "partial revert" may be needed until this gets settled.

I did read this part, and yes, I'm not sure either. You mentioned you would
confirm that in the morning; I look forward to it. I hope the in_nmi()
function is reliable to use from here.
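
For reference, the context check I have in mind is tiny (untested
sketch; tracepoint_srcu_domain() is a made-up helper name):

/* in_nmi() tests the NMI bits that nmi_enter() sets in preempt_count(),
 * so this is only correct for code running after nmi_enter() -- which
 * is exactly the thunk-timing question above. */
static inline struct srcu_struct *tracepoint_srcu_domain(void)
{
	if (in_nmi())
		return &tracepoint_srcu_nmi;
	return &tracepoint_srcu;
}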


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Steven Rostedt
On Tue, 7 Aug 2018 19:13:32 -0700
Joel Fernandes  wrote:

> On Tue, Aug 7, 2018 at 6:55 PM, Steven Rostedt  wrote:
> > On Tue, 7 Aug 2018 18:17:42 -0700
> > Joel Fernandes  wrote:
> >  
> >> From 6986af946ceb04fc9ddc6d5b45fc559b6807e465 Mon Sep 17 00:00:00 2001
> >> From: "Joel Fernandes (Google)" 
> >> Date: Sun, 5 Aug 2018 20:17:41 -0700
> >> Subject: [PATCH] tracepoint: Run tracepoints even after CPU is offline
> >>
> >> Commit f37755490fe9 ("tracepoints: Do not trace when cpu is offline")
> >> causes a problem for lockdep using tracepoint code. Once a CPU is
> >> offline, tracepoints do not get called; however, this causes a big problem
> >> for lockdep probes that need to run so that IRQ annotations are marked
> >> correctly.
> >>
> >> A race is possible where while the CPU is going offline, an interrupt
> >> can come in and then a lockdep assert causes an annotation warning:
> >>
> >> [  106.551354] IRQs not enabled as expected
> >> [  106.551785] WARNING: CPU: 1 PID: 0 at kernel/time/tick-sched.c:982
> >>  tick_nohz_idle_enter+0x99/0xb0
> >> [  106.552964] Modules linked in:
> >> [  106.553299] CPU: 1 PID: 0 Comm: swapper/1 Tainted: GW
> >>
> >> We need tracepoints to run as late as possible. This commit fixes the
> >> issue by removing the cpu_online check in tracepoint code that was
> >> introduced in the mentioned commit, however we now switch to using SRCU
> >> for all tracepoints and specially handle calling tracepoints from NMI so
> >> that we don't run into issues that result from using sched-RCU when the
> >> CPUs are marked to be offline.
> >>
> >> Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and
> >>   unify their usage")
> >> Reported-by: Masami Hiramatsu 
> >> Signed-off-by: Joel Fernandes (Google)   
> >
> >
> > The above change log doesn't look like it matches the NMI patch.
> >
> > Can you resend with just the NMI changes? I already handled the cpu
> > offline ones.  
> 
> Ok, sent with "cpu offline" changes dropped, here it is:
> https://lore.kernel.org/patchwork/patch/972657/
> 
> If you could add your Reported-by to it, that would be great as well.
> 
> >
> > But I may still have concerns with this patch.  
> 
> Ok let me know what you think.
> 

Not sure you saw this part of my reply:

> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void);
>   } while ((++it_func_ptr)->func);\
>   }   \
>   \
> - if (rcuidle)\
> - srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
> + srcu_read_unlock_notrace(ss, idx);  \  

Hmm, why do we have the two different SRCU handles?

Thinking about this, if this can be called by the "thunk" code, then
there might be an issue. I think the "thunk" code can be called before
in_nmi() is set, so we don't know if we are in an NMI or not. I need to
look at that code to make sure. If in_nmi() still works, then we should
use the _nmi SRCU handle (if that's required).

But I'm not sure using SRCU for all tracepoints is needed at this
moment. I'll have to look deeper into this tomorrow. But it's getting
close to the merge window, and this needs to be settled quickly. Another
"partial revert" may be needed until this gets settled.


-- Steve


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Joel Fernandes
On Tue, Aug 7, 2018 at 6:55 PM, Steven Rostedt  wrote:
> On Tue, 7 Aug 2018 18:17:42 -0700
> Joel Fernandes  wrote:
>
>> From 6986af946ceb04fc9ddc6d5b45fc559b6807e465 Mon Sep 17 00:00:00 2001
>> From: "Joel Fernandes (Google)" 
>> Date: Sun, 5 Aug 2018 20:17:41 -0700
>> Subject: [PATCH] tracepoint: Run tracepoints even after CPU is offline
>>
>> Commit f37755490fe9 ("tracepoints: Do not trace when cpu is offline")
>> causes a problem for lockdep using tracepoint code. Once a CPU is
>> offline, tracepoints do not get called; however, this causes a big problem
>> for lockdep probes that need to run so that IRQ annotations are marked
>> correctly.
>>
>> A race is possible where while the CPU is going offline, an interrupt
>> can come in and then a lockdep assert causes an annotation warning:
>>
>> [  106.551354] IRQs not enabled as expected
>> [  106.551785] WARNING: CPU: 1 PID: 0 at kernel/time/tick-sched.c:982
>>  tick_nohz_idle_enter+0x99/0xb0
>> [  106.552964] Modules linked in:
>> [  106.553299] CPU: 1 PID: 0 Comm: swapper/1 Tainted: GW
>>
>> We need tracepoints to run as late as possible. This commit fixes the
>> issue by removing the cpu_online check in tracepoint code that was
>> introduced in the mentioned commit, however we now switch to using SRCU
>> for all tracepoints and specially handle calling tracepoints from NMI so
>> that we don't run into issues that result from using sched-RCU when the
>> CPUs are marked to be offline.
>>
>> Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and
>>   unify their usage")
>> Reported-by: Masami Hiramatsu 
>> Signed-off-by: Joel Fernandes (Google) 
>
>
> The above change log doesn't look like it matches the NMI patch.
>
> Can you resend with just the NMI changes? I already handled the cpu
> offline ones.

Ok, sent with "cpu offline" changes dropped, here it is:
https://lore.kernel.org/patchwork/patch/972657/

If you could add your Reported-by to it, that would be great as well.

>
> But I may still have concerns with this patch.

Ok let me know what you think.

Thanks!

 - Joel


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Steven Rostedt
On Tue, 7 Aug 2018 18:17:42 -0700
Joel Fernandes  wrote:

> From 6986af946ceb04fc9ddc6d5b45fc559b6807e465 Mon Sep 17 00:00:00 2001
> From: "Joel Fernandes (Google)" 
> Date: Sun, 5 Aug 2018 20:17:41 -0700
> Subject: [PATCH] tracepoint: Run tracepoints even after CPU is offline
> 
> Commit f37755490fe9 ("tracepoints: Do not trace when cpu is offline")
> causes a problem for lockdep using tracepoint code. Once a CPU is
> offline, tracepoints do not get called; however, this causes a big problem
> for lockdep probes that need to run so that IRQ annotations are marked
> correctly.
> 
> A race is possible where while the CPU is going offline, an interrupt
> can come in and then a lockdep assert causes an annotation warning:
> 
> [  106.551354] IRQs not enabled as expected
> [  106.551785] WARNING: CPU: 1 PID: 0 at kernel/time/tick-sched.c:982
>  tick_nohz_idle_enter+0x99/0xb0
> [  106.552964] Modules linked in:
> [  106.553299] CPU: 1 PID: 0 Comm: swapper/1 Tainted: GW
> 
> We need tracepoints to run as late as possible. This commit fixes the
> issue by removing the cpu_online check in tracepoint code that was
> introduced in the mentioned commit, however we now switch to using SRCU
> for all tracepoints and specially handle calling tracepoints from NMI so
> that we don't run into issues that result from using sched-RCU when the
> CPUs are marked to be offline.
> 
> Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and
>   unify their usage")
> Reported-by: Masami Hiramatsu 
> Signed-off-by: Joel Fernandes (Google) 


The above change log doesn't look like it matches the NMI patch.

Can you resend with just the NMI changes? I already handled the cpu
offline ones.

But I may still have concerns with this patch.

---
>  include/linux/tracepoint.h | 27 +++
>  kernel/tracepoint.c| 10 ++
>  2 files changed, 17 insertions(+), 20 deletions(-)
> 
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index d9a084c72541..5733502bb3ce 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -35,6 +35,7 @@ struct trace_eval_map {
>  #define TRACEPOINT_DEFAULT_PRIO  10
>  
>  extern struct srcu_struct tracepoint_srcu;
> +extern struct srcu_struct tracepoint_srcu_nmi;
>  
>  extern int
>  tracepoint_probe_register(struct tracepoint *tp, void *probe, void *data);
> @@ -144,13 +145,11 @@ extern void syscall_unregfunc(void);
>   void *it_func;  \
>   void *__data;   \
>   int __maybe_unused idx = 0; \
> + struct srcu_struct *ss; \
>   \
>   if (!(cond))\
>   return; \
>   \
> - /* srcu can't be used from NMI */   \
> - WARN_ON_ONCE(rcuidle && in_nmi());  \
> - \
>   /* keep srcu and sched-rcu usage consistent */  \
>   preempt_disable_notrace();  \
>   \
> @@ -159,7 +158,11 @@ extern void syscall_unregfunc(void);
>* doesn't work from the idle path. \
>*/ \
>   if (rcuidle)\
> - idx = srcu_read_lock_notrace(&tracepoint_srcu); \
> + ss = &tracepoint_srcu_nmi;  \
> + else\
> + ss = &tracepoint_srcu;  \
> + \
> + idx = srcu_read_lock_notrace(ss);   \
>   \
>   it_func_ptr = rcu_dereference_raw((tp)->funcs); \
>   \
> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void);
>   } while ((++it_func_ptr)->func);\
>   }   \
>   \
> - if (rcuidle)\
> - srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
> + srcu_read_unlock_notrace(ss, idx);  \

Hmm, why do we have the two different SRCU handles?

Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Joel Fernandes
On Tue, Aug 7, 2018 at 5:48 PM, Steven Rostedt  wrote:
> On Tue, 07 Aug 2018 19:54:13 -0400
> Joel Fernandes  wrote:
>
>
>> >OK, I hit this bug, but it's not because of the partial revert. This
>> >bug seems to need another partial revert. I like your movement of
>> >the code, but I'm starting to doubt that we can use a trace event as a
>> >hook for critical areas yet. Well, not until we can use SRCU in NMI.
>> >
>>
>> I sent a patch to use SRCU for all tracepoints including NMI. That
>> patch also removes this warning and fixes the one other issue Masami
>> reported (hotplug causing a warning).
>
> Is it one I can take, or does Paul have it? I'll need it to continue
> with this as is, because I can't pass my tests without triggering that
> warning (and that fails the test).

I think you could take it if you're OK with it; it's purely in the
tracing code and I tested it yesterday morning. It's attached to this
email. I tested that it fixes the mmio trace (hotplug related) issue.

>>
>> If Paul and Mathieu can confirm SRCU works on offline CPUs that would
>> be great.
>
> Yes please.
>

BTW I found this in Paul's RCU Requirements document: "SRCU is insensitive
to whether or not a CPU is online", so I think it will work.

The paragraph was:
Also unlike other RCU flavors, SRCU's callbacks-wait function
srcu_barrier() may be invoked from CPU-hotplug notifiers,
though this is not necessarily a good idea.
The reason that this is possible is that SRCU is insensitive
to whether or not a CPU is online, which means that srcu_barrier()
need not exclude CPU-hotplug operations.

>>
>> Is it just this one warning, or something else, that makes you feel
>> tracepoints for critical paths aren't feasible?
>
> Currently I'm down to this, but I can't get past my first test in my
> ktest suite, because it fails here.

OK, hopefully this will get you past that. Sorry for the frustration.
From 6986af946ceb04fc9ddc6d5b45fc559b6807e465 Mon Sep 17 00:00:00 2001
From: "Joel Fernandes (Google)" 
Date: Sun, 5 Aug 2018 20:17:41 -0700
Subject: [PATCH] tracepoint: Run tracepoints even after CPU is offline

Commit f37755490fe9 ("tracepoints: Do not trace when cpu is offline")
causes a problem for lockdep using tracepoint code. Once a CPU is
offline, tracepoints do not get called; however, this causes a big problem
for lockdep probes that need to run so that IRQ annotations are marked
correctly.

A race is possible where while the CPU is going offline, an interrupt
can come in and then a lockdep assert causes an annotation warning:

[  106.551354] IRQs not enabled as expected
[  106.551785] WARNING: CPU: 1 PID: 0 at kernel/time/tick-sched.c:982
 tick_nohz_idle_enter+0x99/0xb0
[  106.552964] Modules linked in:
[  106.553299] CPU: 1 PID: 0 Comm: swapper/1 Tainted: GW

We need tracepoints to run as late as possible. This commit fixes the
issue by removing the cpu_online check in tracepoint code that was
introduced in the mentioned commit, however we now switch to using SRCU
for all tracepoints and specially handle calling tracepoints from NMI so
that we don't run into issues that result from using sched-RCU when the
CPUs are marked to be offline.

Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and
  unify their usage")
Reported-by: Masami Hiramatsu 
Signed-off-by: Joel Fernandes (Google) 
---
 include/linux/tracepoint.h | 27 +++
 kernel/tracepoint.c| 10 ++
 2 files changed, 17 insertions(+), 20 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index d9a084c72541..5733502bb3ce 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -35,6 +35,7 @@ struct trace_eval_map {
 #define TRACEPOINT_DEFAULT_PRIO	10
 
 extern struct srcu_struct tracepoint_srcu;
+extern struct srcu_struct tracepoint_srcu_nmi;
 
 extern int
 tracepoint_probe_register(struct tracepoint *tp, void *probe, void *data);
@@ -144,13 +145,11 @@ extern void syscall_unregfunc(void);
 		void *it_func;		\
 		void *__data;		\
 		int __maybe_unused idx = 0;\
+		struct srcu_struct *ss;	\
 	\
 		if (!(cond))		\
 			return;		\
 	\
-		/* srcu can't be used from NMI */			\
-		WARN_ON_ONCE(rcuidle && in_nmi());			\
-	\
 		/* keep srcu and sched-rcu usage consistent */		\
 		preempt_disable_notrace();\
 	\
@@ -159,7 +158,11 @@ extern void syscall_unregfunc(void);
 		 * doesn't work from the idle path.			\
 		 */			\
 		if (rcuidle)		\
-			idx = srcu_read_lock_notrace(&tracepoint_srcu);	\
+			ss = &tracepoint_srcu_nmi;			\
+		else			\
+			ss = &tracepoint_srcu;\
+	\
+		idx = srcu_read_lock_notrace(ss);			\
 	\
 		it_func_ptr = rcu_dereference_raw((tp)->funcs);		\
 	\
@@ -171,8 +174,7 @@ extern void syscall_unregfunc(void);
 			} while ((++it_func_ptr)->func);		\
 		}			\
 	\
-		if (rcuidle)		\
-		srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
+		srcu_read_unlock_notrace(ss, idx);			\
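
(The kernel/tracepoint.c hunk is cut off above; per the diffstat it is
about 10 lines. Its core would presumably just define the second SRCU
domain, along these lines -- a guess, not the actual hunk:)

DEFINE_SRCU(tracepoint_srcu_nmi);
EXPORT_SYMBOL_GPL(tracepoint_srcu_nmi);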

Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Steven Rostedt
On Tue, 07 Aug 2018 19:54:13 -0400
Joel Fernandes  wrote:


> >OK, I hit this bug, but it's not because of the partial revert. This
> >bug seems to need another partial revert. I like your movement of
> >the code, but I'm starting to doubt that we can use a trace event as a
> >hook for critical areas yet. Well, not until we can use SRCU in NMI.
> >  
> 
> I sent a patch to use SRCU for all tracepoints including NMI. That
> patch also removes this warning and fixes the one other issue Masami
> reported (hotplug causing a warning).

Is it one I can take, or does Paul have it? I'll need it to continue
with this as is, because I can't pass my tests without triggering that
warning (and that fails the test).

> 
> If Paul and Mathieu can confirm SRCU works on offline CPUs that would
> be great.

Yes please.

> 
> Is it just this one warning, or something else, that makes you feel
> tracepoints for critical paths aren't feasible?

Currently I'm down to this, but I can't get past my first test in my
ktest suite, because it fails here.

-- Steve



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Joel Fernandes



On August 7, 2018 7:45:15 PM EDT, Steven Rostedt  wrote:
>On Tue, 07 Aug 2018 11:24:13 -0400
>Joel Fernandes  wrote:
>
>> On August 7, 2018 11:09:06 AM EDT, Steven Rostedt
> wrote:
>> >On Tue, 07 Aug 2018 10:48:05 -0400
>> >Joel Fernandes  wrote:
>> >  
>> >> >You mean if someone add a tracepoint callback to the irq disable
>> >> >tracepoint, and did a lockdep assert to make sure interrupts are
>> >> >disabled?
>> >> 
>> >> Yes that's what I meant.  
>> >
>> >That sounds like a "Doctor, it hurts me when I do this" problem ;-) 
>
>> 
>> Haha, yes true. But just to clarify, I didn't do this to see the
>> problem but noticed it when turning on existing things. :-) But I see
>> your point...
>> 
>
>OK, I hit this bug, but it's not because of the partial revert. This
>bug seems to need another partial revert. I like your movement of
>the code, but I'm starting to doubt that we can use a trace event as a
>hook for critical areas yet. Well, not until we can use SRCU in NMI.
>

I sent a patch to use SRCU for all tracepoints including NMI. That patch also
removes this warning and fixes the one other issue Masami reported (hotplug
causing a warning).

If Paul and Mathieu can confirm SRCU works on offline CPUs that would be great.

Is it just this one warning, or something else, that makes you feel tracepoints
for critical paths aren't feasible?

Thanks.




>#define __DO_TRACE(tp, proto, args, cond, rcuidle) \
>   do {\
>   struct tracepoint_func *it_func_ptr;\
>   void *it_func;  \
>   void *__data;   \
>   int __maybe_unused idx = 0; \
>   \
>   if (!(cond))\
>   return; \
>   \
>   /* srcu can't be used from NMI */   \
>
>   WARN_ON_ONCE(rcuidle && in_nmi()); <== WARN_ON_ONCE hit!
>
>WARNING: CPU: 3 PID: 3727 at
>/work/build/trace/nobackup/linux-test.git/include/trace/events/preemptirq.h:38
>trace_irq_disable_rcuidle+0x2a/0x6c
>Modules linked in: ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6
>nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 crc_ccitt r8169 ppdev
>parport_pc parport
>CPU: 3 PID: 3727 Comm: trace-cmd Not tainted 4.18.0-rc6-test+ #14
>Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
>EIP: trace_irq_disable_rcuidle+0x2a/0x6c
>Code: e9 01 00 00 00 c3 64 8b 0d 24 e1 50 c1 0f a3 0d e0 f1 3f c1 73 55
>55 89 e5 57 56 53 51 64 8b 0d cc 37 51 c1 0f ba e1 14 73 02 <0f> 0b 89
>d7 89 c6 b8 e0 d8 2e c1 e8 8e 5b fa ff 8b 1d 9c 27 3d c1 
>EAX: c0401509 EBX: efa43680 ECX: 8011 EDX: c0dc81f3
>ESI: ed823d44 EDI: efa43680 EBP: ed823cd0 ESP: ed823cc0
>DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010047
>CR0: 80050033 CR2: b7f06000 CR3: 30513b00 CR4: 001406f0
>Call Trace:
> trace_hardirqs_off_caller+0x23/0x2d
> trace_hardirqs_off_thunk+0xc/0x10
>EIP: default_do_nmi+0x1/0x157
>
>
>-- Steve

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Steven Rostedt
On Tue, 07 Aug 2018 11:24:13 -0400
Joel Fernandes  wrote:

> On August 7, 2018 11:09:06 AM EDT, Steven Rostedt  wrote:
> >On Tue, 07 Aug 2018 10:48:05 -0400
> >Joel Fernandes  wrote:
> >  
> >> >You mean if someone add a tracepoint callback to the irq disable
> >> >tracepoint, and did a lockdep assert to make sure interrupts are
> >> >disabled?
> >> 
> >> Yes that's what I meant.  
> >
> >That sounds like a "Doctor, it hurts me when I do this" problem ;-)  
> 
> Haha, yes true. But just to clarify, I didn't do this to see the problem but
> noticed it when turning on existing things. :-) But I see your point...
> 

OK, I hit this bug, but it's not because of the partial revert. This
bug seems to need another partial revert. I like your movement of
the code, but I'm starting to doubt that we can use a trace event as a
hook for critical areas yet. Well, not until we can use SRCU in NMI.

#define __DO_TRACE(tp, proto, args, cond, rcuidle)  \
do {\
struct tracepoint_func *it_func_ptr;\
void *it_func;  \
void *__data;   \
int __maybe_unused idx = 0; \
\
if (!(cond))\
return; \
\
/* srcu can't be used from NMI */   \

WARN_ON_ONCE(rcuidle && in_nmi()); <== WARN_ON_ONCE hit!

WARNING: CPU: 3 PID: 3727 at 
/work/build/trace/nobackup/linux-test.git/include/trace/events/preemptirq.h:38 
trace_irq_disable_rcuidle+0x2a/0x6c
Modules linked in: ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 
ip6table_filter ip6_tables ipv6 crc_ccitt r8169 ppdev parport_pc parport
CPU: 3 PID: 3727 Comm: trace-cmd Not tainted 4.18.0-rc6-test+ #14
Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
EIP: trace_irq_disable_rcuidle+0x2a/0x6c
Code: e9 01 00 00 00 c3 64 8b 0d 24 e1 50 c1 0f a3 0d e0 f1 3f c1 73 55 55 89 
e5 57 56 53 51 64 8b 0d cc 37 51 c1 0f ba e1 14 73 02 <0f> 0b 89 d7 89 c6 b8 e0 
d8 2e c1 e8 8e 5b fa ff 8b 1d 9c 27 3d c1 
EAX: c0401509 EBX: efa43680 ECX: 8011 EDX: c0dc81f3
ESI: ed823d44 EDI: efa43680 EBP: ed823cd0 ESP: ed823cc0
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010047
CR0: 80050033 CR2: b7f06000 CR3: 30513b00 CR4: 001406f0
Call Trace:
 trace_hardirqs_off_caller+0x23/0x2d
 trace_hardirqs_off_thunk+0xc/0x10
EIP: default_do_nmi+0x1/0x157


-- Steve


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Joel Fernandes



On August 7, 2018 11:09:06 AM EDT, Steven Rostedt  wrote:
>On Tue, 07 Aug 2018 10:48:05 -0400
>Joel Fernandes  wrote:
>
>> >You mean if someone add a tracepoint callback to the irq disable
>> >tracepoint, and did a lockdep assert to make sure interrupts are
>> >disabled?  
>> 
>> Yes that's what I meant.
>
>That sounds like a "Doctor, it hurts me when I do this" problem ;-)

Haha, yes true. But just to clarify, I didn't do this to see the problem but
noticed it when turning on existing things. :-) But I see your point...

Thanks,

- Joel

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Steven Rostedt
On Tue, 07 Aug 2018 10:48:05 -0400
Joel Fernandes  wrote:

> >You mean if someone add a tracepoint callback to the irq disable
> >tracepoint, and did a lockdep assert to make sure interrupts are
> >disabled?  
> 
> Yes that's what I meant.

That sounds like a "Doctor, it hurts me when I do this" problem ;-)

-- Steve



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Joel Fernandes



On August 7, 2018 10:34:10 AM EDT, Steven Rostedt  wrote:
>On Tue, 07 Aug 2018 10:10:59 -0400
>Joel Fernandes  wrote:
>
>> On August 7, 2018 9:49:54 AM EDT, Steven Rostedt
> wrote:
>> >On Tue, 7 Aug 2018 06:33:35 -0700
>> >Joel Fernandes  wrote:
>> >  
>> >> Thanks, also one more thing I noticed in your patch,
>> >> lockdep_hardirqs_off needs to be called before all other probes but
>> >> you're calling it after. This is why I registered it with INT_MAX:
>> >> 
>> >> register_trace_prio_irq_disable(lockdep_hardirqs_off, NULL, INT_MAX);
>> >> 
>> >> Without it you may get annotation warnings. Thanks,  
>> >
>> >Interesting. I was following the old way where we called the tracing
>> >code before calling the lockdep code (all hard coded and not from
>> >trace events). Does this have something to do with calling the code
>> >from a tracepoint?
>> >
>> >Do you have an example that could trigger the warning?
>> >  
>> 
>> I remember the warnings but can't remember now how I triggered them.
>> I think I saw them with the irqsoff tracer or irq trace events
>> running, with lockdep turned on.
>
>I'll see if I can trigger it. I'll run this all through my tests.

Ok.

>> Also an irq disable probe that does a lockdep assert that irqs are
>> disabled could trigger it?
>> 
>
>You mean if someone add a tracepoint callback to the irq disable
>tracepoint, and did a lockdep assert to make sure interrupts are
>disabled?

Yes that's what I meant.

Thanks,

- Joel

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Steven Rostedt
On Tue, 07 Aug 2018 10:10:59 -0400
Joel Fernandes  wrote:

> On August 7, 2018 9:49:54 AM EDT, Steven Rostedt  wrote:
> >On Tue, 7 Aug 2018 06:33:35 -0700
> >Joel Fernandes  wrote:
> >  
> >> Thanks, also one more thing I noticed in your patch,
> >> lockdep_hardirqs_off needs to be called before all other probes but
> >> you're calling it after. This is why I registered it with INT_MAX:
> >> 
> >> register_trace_prio_irq_disable(lockdep_hardirqs_off, NULL, INT_MAX);
> >> 
> >> Without it you may get annotation warnings. Thanks,  
> >
> >Interesting. I was following the old way where we called the tracing
> >code before calling the lockdep code (all hard coded and not from
> >trace events). Does this have something to do with calling the code from
> >a tracepoint?
> >
> >Do you have an example that could trigger the warning?
> >  
> 
> I remember the warnings but can't remember now how I triggered them.
> I think I saw them with the irqsoff tracer or irq trace events
> running, with lockdep turned on.

I'll see if I can trigger it. I'll run this all through my tests.

> 
> Also an irq disable probe that does a lockdep assert that irqs are
> disabled could trigger it?
> 

You mean if someone add a tracepoint callback to the irq disable
tracepoint, and did a lockdep assert to make sure interrupts are
disabled?

-- Steve



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Joel Fernandes



On August 7, 2018 9:49:54 AM EDT, Steven Rostedt  wrote:
>On Tue, 7 Aug 2018 06:33:35 -0700
>Joel Fernandes  wrote:
>
>> Thanks, also one more thing I noticed in your patch,
>> lockdep_hardirqs_off needs to be called before all other probes but
>> you're calling it after. This is why I registered it with INT_MAX:
>> 
>> register_trace_prio_irq_disable(lockdep_hardirqs_off, NULL, INT_MAX);
>> 
>> Without it you may get annotation warnings. Thanks,
>
>Interesting. I was following the old way where we called the tracing
>code before calling the lockdep code (all hard coded and not from
>trace events). Does this have something to do with calling the code from
>a tracepoint?
>
>Do you have an example that could trigger the warning?
>

I remember the warnings but can't remember now how I triggered them. I think I 
saw them with the irqsoff tracer or irq trace events running, with lockdep 
turned on.

Also, could an irq disable probe that does a lockdep assert that irqs are
disabled trigger it?
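
For example, something like this hypothetical probe (the probe name, the
init wrapper, and the header path are made up here just to illustrate the
ordering problem, assuming the preemptirq trace header from this series)
would assert lockdep's view of the IRQ state:

#include <linux/lockdep.h>
#include <trace/events/preemptirq.h>

/* Runs on every irq_disable tracepoint hit. If it runs before
 * lockdep_hardirqs_off() has updated lockdep's per-task state, the
 * assertion fires even though IRQs really are disabled. */
static void assert_irqs_off_probe(void *data, unsigned long ip,
				  unsigned long parent_ip)
{
	lockdep_assert_irqs_disabled();
}

static int __init assert_probe_init(void)
{
	/* default priority: nothing guarantees lockdep's probe ran first */
	return register_trace_irq_disable(assert_irqs_off_probe, NULL);
}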

thanks,

- Joel


>-- Steve



Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Steven Rostedt
On Tue, 7 Aug 2018 06:33:35 -0700
Joel Fernandes  wrote:

> Thanks, also one more thing I noticed in your patch,
> lockdep_hardirqs_off needs to be called before all other probes but
> you're calling it after. This is why I registered it with INT_MAX:
> 
> register_trace_prio_irq_disable(lockdep_hardirqs_off, NULL, INT_MAX);
> 
> Without it you may get annotation warnings. Thanks,

Interesting. I was following the old way where we called the tracing
code before calling the lockdep code (all hard coded and not from
trace events). Does this have something to do with calling the code from
a tracepoint?

Do you have an example that could trigger the warning?

-- Steve


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-07 Thread Joel Fernandes
On Mon, Aug 6, 2018 at 6:43 PM, Steven Rostedt  wrote:
> On Mon, 6 Aug 2018 17:43:19 -0700
> Joel Fernandes  wrote:
>
>> On Mon, Aug 6, 2018 at 12:50 PM, Steven Rostedt  wrote:
>> >
>> > With this patch applied, I'm constantly getting lockdep errors. Instead
>> > of doing a full revert of the patch, I did this, which makes all those
>> > errors go away. I may apply this for now, and we can revisit having
>> > lockdep use the tracepoint code. But since it's currently always
>> > enabled, I'm thinking of just leaving this as is. The macros are still
>> > clean from Joel's patch.
>> >
>> > Thoughts?
>>
>> I like your patch. Thanks a lot for doing this. It keeps most of the
>> benefits of my patch while avoiding the issues with lockdep. I agree
>> we can look at the remaining lockdep issue after. There were several
>> lockdep issues with this patch that I fixed over the months, but
>> there's still the one that Masami reported that I believe you're also
>> seeing. Once I'm back I'll work on figuring that one out.
>>
>> Could you pull in the fixes to the other issues I posted though? With
>> that we should be good.
>> https://lore.kernel.org/patchwork/patch/971104/
>> https://lore.kernel.org/patchwork/patch/971829/
>>
>
> I already had these applied when I created this patch ;-)
>
> Thanks, I'll add it.

Thanks, also one more thing I noticed in your patch,
lockdep_hardirqs_off needs to be called before all other probes but
you're calling it after. This is why I registered it with INT_MAX:

register_trace_prio_irq_disable(lockdep_hardirqs_off, NULL, INT_MAX);
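
Probes on a tracepoint run in descending priority order, so INT_MAX puts
lockdep's probe first, ahead of any default-priority probe. A sketch of
the intended ordering (my_probe and the init wrapper are hypothetical):

static int __init probes_init(void)
{
	int ret;

	/* INT_MAX: lockdep marks hardirqs off before anyone else runs */
	ret = register_trace_prio_irq_disable(lockdep_hardirqs_off,
					      NULL, INT_MAX);
	if (ret)
		return ret;
	/* default priority: runs after lockdep has updated its state */
	return register_trace_irq_disable(my_probe, NULL);
}

(The irq_enable side used INT_MIN for the symmetric reason, so lockdep
re-enables its annotation only after all the other probes have run.)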

Without it you may get annotation warnings. Thanks,

- Joel


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-06 Thread Steven Rostedt
On Mon, 6 Aug 2018 17:43:19 -0700
Joel Fernandes  wrote:

> On Mon, Aug 6, 2018 at 12:50 PM, Steven Rostedt  wrote:
> >
> > With this patch applied, I'm constantly getting lockdep errors. Instead
> > of doing a full revert of the patch, I did this, which makes all those
> > errors go away. I may apply this for now, and we can revisit having
> > lockdep use the tracepoint code. But since it's currently always
> > enabled, I'm thinking of just leaving this as is. The macros are still
> > clean from Joel's patch.
> >
> > Thoughts?  
> 
> I like your patch. Thanks a lot for doing this. It keeps most of the
> benefits of my patch while avoiding the issues with lockdep. I agree
> we can look at the remaining lockdep issue after. There were several
> lockdep issues with this patch that I fixed over the months, but
> there's still the one that Masami reported that I believe you're also
> seeing. Once I'm back I'll work on figuring that one out.
> 
> Could you pull in the fixes to the other issues I posted though? With
> that we should be good.
> https://lore.kernel.org/patchwork/patch/971104/
> https://lore.kernel.org/patchwork/patch/971829/
>

I already had these applied when I created this patch ;-)

Thanks, I'll add it.

-- Steve


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-06 Thread Joel Fernandes
On Mon, Aug 6, 2018 at 12:50 PM, Steven Rostedt  wrote:
>
> With this patch applied, I'm constantly getting lockdep errors. Instead
> of doing a full revert of the patch, I did this, which makes all those
> errors go away. I may apply this for now, and we can revisit having
> lockdep use the tracepoint code. But since it's currently always
> enabled, I'm thinking of just leaving this as is. The macros are still
> clean from Joel's patch.
>
> Thoughts?

I like your patch. Thanks a lot for doing this. It keeps most of the
benefits of my patch while avoiding the issues with lockdep. I agree
we can look at the remaining lockdep issue after. There were several
lockdep issues with this patch that I fixed over the months, but
there's still the one that Masami reported that I believe you're also
seeing. Once I'm back I'll work on figuring that one out.

Could you pull in the fixes to the other issues I posted though? With
that we should be good.
https://lore.kernel.org/patchwork/patch/971104/
https://lore.kernel.org/patchwork/patch/971829/

thanks,

- Joel


Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage

2018-08-06 Thread Steven Rostedt


With this patch applied, I'm constantly getting lockdep errors. Instead
of doing a full revert of the patch, I did this, which makes all those
errors go away. I may apply this for now, and we can revisit having
lockdep use the tracepoint code. But since it's currently always
enabled, I'm thinking of just leaving this as is. The macros are still
clean from Joel's patch.

Thoughts?

-- Steve

diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
index 50edb9cbbd26..a93476c6c954 100644
--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -19,6 +19,8 @@
 #ifdef CONFIG_PROVE_LOCKING
   extern void trace_softirqs_on(unsigned long ip);
   extern void trace_softirqs_off(unsigned long ip);
+  extern void lockdep_hardirqs_on(unsigned long ip);
+  extern void lockdep_hardirqs_off(unsigned long ip);
 #else
 # define trace_softirqs_on(ip) do { } while (0)
 # define trace_softirqs_off(ip)	do { } while (0)
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index a8113357ceeb..30d0eb857f2e 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -267,7 +267,6 @@ struct held_lock {
  * Initialization, self-test and debugging-output methods:
  */
 extern void lockdep_init(void);
-extern void lockdep_init_early(void);
 extern void lockdep_reset(void);
 extern void lockdep_reset_lock(struct lockdep_map *lock);
 extern void lockdep_free_key_range(void *start, unsigned long size);
@@ -408,7 +407,6 @@ static inline void lockdep_on(void)
 # define lock_set_class(l, n, k, s, i) do { } while (0)
 # define lock_set_subclass(l, s, i)	do { } while (0)
 # define lockdep_init()		do { } while (0)
-# define lockdep_init_early()  do { } while (0)
 # define lockdep_init_map(lock, name, key, sub) \
do { (void)(name); (void)(key); } while (0)
 # define lockdep_set_class(lock, key)  do { (void)(key); } while (0)
diff --git a/init/main.c b/init/main.c
index 44fe43be84c1..5d42e577643a 100644
--- a/init/main.c
+++ b/init/main.c
@@ -649,8 +649,6 @@ asmlinkage __visible void __init start_kernel(void)
call_function_init();
WARN(!irqs_disabled(), "Interrupts were enabled early\n");
 
-   lockdep_init_early();
-
early_boot_irqs_disabled = false;
local_irq_enable();
 
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 03bfaeb9f4e6..e406c5fdb41e 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2840,8 +2840,7 @@ static void __trace_hardirqs_on_caller(unsigned long ip)
debug_atomic_inc(hardirqs_on_events);
 }
 
-static void lockdep_hardirqs_on(void *none, unsigned long ignore,
-   unsigned long ip)
+void lockdep_hardirqs_on(unsigned long ip)
 {
if (unlikely(!debug_locks || current->lockdep_recursion))
return;
@@ -2885,8 +2884,7 @@ static void lockdep_hardirqs_on(void *none, unsigned long ignore,
 /*
  * Hardirqs were disabled:
  */
-static void lockdep_hardirqs_off(void *none, unsigned long ignore,
-unsigned long ip)
+void lockdep_hardirqs_off(unsigned long ip)
 {
struct task_struct *curr = current;
 
@@ -4315,14 +4313,6 @@ void lockdep_reset_lock(struct lockdep_map *lock)
raw_local_irq_restore(flags);
 }
 
-void __init lockdep_init_early(void)
-{
-#ifdef CONFIG_PROVE_LOCKING
-   register_trace_prio_irq_disable(lockdep_hardirqs_off, NULL, INT_MAX);
-   register_trace_prio_irq_enable(lockdep_hardirqs_on, NULL, INT_MIN);
-#endif
-}
-
 void __init lockdep_init(void)
 {
printk("Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., 
Ingo Molnar\n");
diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index e76b78bf258e..8326adcab0dc 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -20,40 +20,52 @@ static DEFINE_PER_CPU(int, tracing_irq_cpu);
 void trace_hardirqs_on(void)
 {
if (!this_cpu_read(tracing_irq_cpu))
-   return;
+   goto out;
 
trace_irq_enable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
this_cpu_write(tracing_irq_cpu, 0);
+
+ out:
+   lockdep_hardirqs_on(CALLER_ADDR0);
 }
 EXPORT_SYMBOL(trace_hardirqs_on);
 
 void trace_hardirqs_off(void)
 {
if (this_cpu_read(tracing_irq_cpu))
-   return;
+   goto out;
 
this_cpu_write(tracing_irq_cpu, 1);
trace_irq_disable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
+
+ out:
+   lockdep_hardirqs_off(CALLER_ADDR0);
 }
 EXPORT_SYMBOL(trace_hardirqs_off);
 
 __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
 {
if (!this_cpu_read(tracing_irq_cpu))
-   return;
+   goto out;
 
trace_irq_enable_rcuidle(CALLER_ADDR0, caller_addr);
this_cpu_write(tracing_irq_cpu, 0);
+
+ out:
+   lockdep_hardirqs_on(CALLER_ADDR0);
 }
 
