Re: [RFC][PATCH 1/9] irq_work: Cleanup

2020-08-18 Thread peterz
On Mon, Aug 17, 2020 at 06:00:05AM -0700, Paul E. McKenney wrote:
> On Mon, Aug 17, 2020 at 11:16:33AM +0200, pet...@infradead.org wrote:
> > On Mon, Aug 17, 2020 at 11:03:25AM +0200, pet...@infradead.org wrote:
> > > On Thu, Jul 23, 2020 at 09:14:11AM -0700, Paul E. McKenney wrote:
> > > > > --- a/kernel/rcu/tree.c
> > > > > +++ b/kernel/rcu/tree.c
> > > > > @@ -1287,8 +1287,6 @@ static int rcu_implicit_dynticks_qs(stru
> > > > >   if (IS_ENABLED(CONFIG_IRQ_WORK) &&
> > > > >   !rdp->rcu_iw_pending && rdp->rcu_iw_gp_seq != rnp->gp_seq &&
> > > > >   (rnp->ffmask & rdp->grpmask)) {
> > > > > - init_irq_work(&rdp->rcu_iw, rcu_iw_handler);
> > > > 
> > > > We are actually better off with the IRQ_WORK_INIT_HARD() here rather
> > > > than unconditionally at boot.
> > > 
> > > Ah, but there isn't an init_irq_work() variant that does the HARD thing.
> > 
> > Ah you meant doing:
> > 
> > rdp->rcu_iw = IRQ_WORK_INIT_HARD(rcu_iw_handler)
> > 
> > But then it is non-obvious how that doesn't trample state. I suppose
> > that rcu_iw_pending thing ensures that... I'll think about it.
> 
> Yes, this is what I had in mind.  And you are right, the point of the
> !rdp->rcu_iw_pending check is to prevent initialization while still
> in use.

So I checked my notes, and the plan was to replace rcu_iw_pending with
the irq_work pending bit, but for that we mustn't clobber that state
every time.
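
Concretely, that would mean keying the guard off the irq_work PENDING
bit, via the irq_work_is_pending() helper this patch adds, instead of
the separate rcu_iw_pending flag. A sketch only, not part of this
series:

	if (IS_ENABLED(CONFIG_IRQ_WORK) &&
	    !irq_work_is_pending(&rdp->rcu_iw) &&	/* irq_work's own PENDING bit */
	    rdp->rcu_iw_gp_seq != rnp->gp_seq &&
	    (rnp->ffmask & rdp->grpmask)) {
		rdp->rcu_iw_gp_seq = rnp->gp_seq;
		irq_work_queue_on(&rdp->rcu_iw, rdp->cpu);
	}

Which is exactly why rdp->rcu_iw would have to keep its one-time HARD
initialization instead of being rewritten with IRQ_WORK_INIT_HARD() on
every pass: the rewrite would wipe the very PENDING bit we want to test.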




Re: [RFC][PATCH 1/9] irq_work: Cleanup

2020-08-17 Thread Paul E. McKenney
On Mon, Aug 17, 2020 at 11:16:33AM +0200, pet...@infradead.org wrote:
> On Mon, Aug 17, 2020 at 11:03:25AM +0200, pet...@infradead.org wrote:
> > On Thu, Jul 23, 2020 at 09:14:11AM -0700, Paul E. McKenney wrote:
> > > > --- a/kernel/rcu/tree.c
> > > > +++ b/kernel/rcu/tree.c
> > > > @@ -1287,8 +1287,6 @@ static int rcu_implicit_dynticks_qs(stru
> > > > if (IS_ENABLED(CONFIG_IRQ_WORK) &&
> > > > !rdp->rcu_iw_pending && rdp->rcu_iw_gp_seq != rnp->gp_seq &&
> > > > (rnp->ffmask & rdp->grpmask)) {
> > > > -   init_irq_work(&rdp->rcu_iw, rcu_iw_handler);
> > > 
> > > We are actually better off with the IRQ_WORK_INIT_HARD() here rather
> > > than unconditionally at boot.
> > 
> > Ah, but there isn't an init_irq_work() variant that does the HARD thing.
> 
> Ah you meant doing:
> 
>   rdp->rcu_iw = IRQ_WORK_INIT_HARD(rcu_iw_handler)
> 
> But then it is non-obvious how that doesn't trample state. I suppose
> that rcu_iw_pending thing ensures that... I'll think about it.

Yes, this is what I had in mind.  And you are right, the point of the
!rdp->rcu_iw_pending check is to prevent initialization while still
in use.
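
In other words, the rewritten hunk would look like this -- a sketch
assuming the IRQ_WORK_INIT_HARD() assignment discussed above:

	if (IS_ENABLED(CONFIG_IRQ_WORK) &&
	    !rdp->rcu_iw_pending && rdp->rcu_iw_gp_seq != rnp->gp_seq &&
	    (rnp->ffmask & rdp->grpmask)) {
		/*
		 * Safe only because !rcu_iw_pending guarantees the
		 * previous rcu_iw is idle: rewriting the whole struct
		 * (node.a_flags included) cannot clobber live state.
		 */
		rdp->rcu_iw = IRQ_WORK_INIT_HARD(rcu_iw_handler);
		rdp->rcu_iw_pending = true;
		rdp->rcu_iw_gp_seq = rnp->gp_seq;
		irq_work_queue_on(&rdp->rcu_iw, rdp->cpu);
	}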

> > > The reason for this is that we get here only if a single grace
> > > period extends beyond 10.5 seconds (mainline) or beyond 30 seconds
> > > (many distribution kernels).  Which almost never happens.  And yes,
> > > rcutree_prepare_cpu() is also invoked as each CPU comes online,
> > > not that this is all that common outside of rcutorture and boot time.  ;-)
> > 
> > What do you mean 'also'? Afaict this is CPU bringup only code (initial
> > and hotplug). We really don't care about code there. It's the slowest
> > possible path we have in the kernel.

The "also" was acknowledging that a workload with lots of CPU hotplug
would also needlessly invoke IRQ_WORK_INIT_HARD() multiple times.

Thanx, Paul

> > > > -   atomic_set(&rdp->rcu_iw.flags, IRQ_WORK_HARD_IRQ);
> > > > rdp->rcu_iw_pending = true;
> > > > rdp->rcu_iw_gp_seq = rnp->gp_seq;
> > > > irq_work_queue_on(&rdp->rcu_iw, rdp->cpu);
> > 


Re: [RFC][PATCH 1/9] irq_work: Cleanup

2020-08-17 Thread peterz
On Mon, Aug 17, 2020 at 11:03:25AM +0200, pet...@infradead.org wrote:
> On Thu, Jul 23, 2020 at 09:14:11AM -0700, Paul E. McKenney wrote:
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -1287,8 +1287,6 @@ static int rcu_implicit_dynticks_qs(stru
> > >   if (IS_ENABLED(CONFIG_IRQ_WORK) &&
> > >   !rdp->rcu_iw_pending && rdp->rcu_iw_gp_seq != rnp->gp_seq &&
> > >   (rnp->ffmask & rdp->grpmask)) {
> > > - init_irq_work(&rdp->rcu_iw, rcu_iw_handler);
> > 
> > We are actually better off with the IRQ_WORK_INIT_HARD() here rather
> > than unconditionally at boot.
> 
> Ah, but there isn't an init_irq_work() variant that does the HARD thing.

Ah you meant doing:

rdp->rcu_iw = IRQ_WORK_INIT_HARD(rcu_iw_handler)

But then it is non-obvious how that doesn't trample state. I suppose
that rcu_iw_pending thing ensures that... I'll think about it.

> > The reason for this is that we get here only if a single grace
> > period extends beyond 10.5 seconds (mainline) or beyond 30 seconds
> > (many distribution kernels).  Which almost never happens.  And yes,
> > rcutree_prepare_cpu() is also invoked as each CPU comes online,
> > not that this is all that common outside of rcutorture and boot time.  ;-)
> 
> What do you mean 'also'? Afaict this is CPU bringup only code (initial
> and hotplug). We really don't care about code there. It's the slowest
> possible path we have in the kernel.
> 
> > > - atomic_set(&rdp->rcu_iw.flags, IRQ_WORK_HARD_IRQ);
> > >   rdp->rcu_iw_pending = true;
> > >   rdp->rcu_iw_gp_seq = rnp->gp_seq;
> > >   irq_work_queue_on(&rdp->rcu_iw, rdp->cpu);
> 


Re: [RFC][PATCH 1/9] irq_work: Cleanup

2020-08-17 Thread peterz
On Thu, Jul 23, 2020 at 09:14:11AM -0700, Paul E. McKenney wrote:
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -1287,8 +1287,6 @@ static int rcu_implicit_dynticks_qs(stru
> > if (IS_ENABLED(CONFIG_IRQ_WORK) &&
> > !rdp->rcu_iw_pending && rdp->rcu_iw_gp_seq != rnp->gp_seq &&
> > (rnp->ffmask & rdp->grpmask)) {
> > -   init_irq_work(&rdp->rcu_iw, rcu_iw_handler);
> 
> We are actually better off with the IRQ_WORK_INIT_HARD() here rather
> than unconditionally at boot.

Ah, but there isn't an init_irq_work() variant that does the HARD thing.
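
Nothing stops us from building one out of the initializers this patch
introduces, though. A hypothetical helper, defined nowhere in this
series:

	/* Hypothetical: init_irq_work(), but marked IRQ_WORK_HARD_IRQ. */
	static inline void
	init_irq_work_hard(struct irq_work *work, void (*func)(struct irq_work *))
	{
		*work = IRQ_WORK_INIT_HARD(func);
	}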

> The reason for this is that we get here only if a single grace
> period extends beyond 10.5 seconds (mainline) or beyond 30 seconds
> (many distribution kernels).  Which almost never happens.  And yes,
> rcutree_prepare_cpu() is also invoked as each CPU comes online,
> not that this is all that common outside of rcutorture and boot time.  ;-)

What do you mean 'also'? Afaict this is CPU bringup only code (initial
and hotplug). We really don't care about code there. It's the slowest
possible path we have in the kernel.

> > -   atomic_set(&rdp->rcu_iw.flags, IRQ_WORK_HARD_IRQ);
> > rdp->rcu_iw_pending = true;
> > rdp->rcu_iw_gp_seq = rnp->gp_seq;
> > irq_work_queue_on(&rdp->rcu_iw, rdp->cpu);



Re: [RFC][PATCH 1/9] irq_work: Cleanup

2020-07-25 Thread Peter Zijlstra
On Sat, Jul 25, 2020 at 01:58:28PM +0200, Ingo Molnar wrote:
> 
> * Peter Zijlstra  wrote:
> 
> > Get rid of the __call_single_node union and clean up the API a little
> > to avoid external code relying on the structure layout as much.
> > 
> > Signed-off-by: Peter Zijlstra (Intel) 
> > ---
> >  drivers/gpu/drm/i915/i915_request.c |4 ++--
> >  include/linux/irq_work.h|   33 +
> >  include/linux/irqflags.h|4 ++--
> >  kernel/bpf/stackmap.c   |2 +-
> >  kernel/irq_work.c   |   18 +-
> >  kernel/printk/printk.c  |6 ++
> >  kernel/rcu/tree.c   |3 +--
> >  kernel/time/tick-sched.c|6 ++
> >  kernel/trace/bpf_trace.c|2 +-
> >  9 files changed, 41 insertions(+), 37 deletions(-)
> > 
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -196,7 +196,7 @@ __notify_execute_cb(struct i915_request
> >  
> > llist_for_each_entry_safe(cb, cn,
> >   llist_del_all(&rq->execute_cb),
> > - work.llnode)
> > + work.node.llist)
> >   fn(&cb->work);
> >  }
> >  
> > @@ -478,7 +478,7 @@ __await_execution(struct i915_request *r
> >  * callback first, then checking the ACTIVE bit, we serialise with
> >  * the completed/retired request.
> >  */
> > -   if (llist_add(&cb->work.llnode, &signal->execute_cb)) {
> > +   if (llist_add(&cb->work.node.llist, &signal->execute_cb)) {
> > if (i915_request_is_active(signal) ||
> > __request_in_flight(signal))
> > __notify_execute_cb_imm(signal);
> 
> Hm, so I was looking at picking up some of the low risk bits (patches #1, #2, #4)
> from this series for v5.9, but the above hunk depends non-trivially on the
> linux-next DRM tree, in particular:
> 
>   1d9221e9d395: ("drm/i915: Skip signaling a signaled request")
>   3255e00edb91: ("drm/i915: Remove i915_request.lock requirement for execution callbacks")
>   etc.
> 
> We could add it sans the i915 bits, but then we'd introduce a semantic 
> conflict in linux-next which isn't nice so close to the merge window.
> 
> One solution would be to delay this into the merge window to after the 
> DRM tree gets merged by Linus. Another would be to help out Stephen 
> with the linux-next merge.
> 
> What would be your preference?

The alternative is splitting the above change out into its own patch
and seeing if Chris is willing to carry that in the DRM tree. IIRC these
'new' names should already work with the current code.

They're different names for the same field in that giant union thing.
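
For reference, the union in question, as removed by the irq_work.h part
of this patch:

	struct irq_work {
		union {
			struct __call_single_node node;
			struct {
				struct llist_node llnode;
				atomic_t flags;
			};
		};
		void (*func)(struct irq_work *);
	};

Assuming (as the aliasing requires) that struct __call_single_node has
a struct llist_node as its first member, work.llnode and
work.node.llist name the same memory, so the i915 rename can go in
ahead of the layout change.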


Re: [RFC][PATCH 1/9] irq_work: Cleanup

2020-07-25 Thread Ingo Molnar


* Peter Zijlstra  wrote:

> Get rid of the __call_single_node union and clean up the API a little
> to avoid external code relying on the structure layout as much.
> 
> Signed-off-by: Peter Zijlstra (Intel) 
> ---
>  drivers/gpu/drm/i915/i915_request.c |4 ++--
>  include/linux/irq_work.h|   33 +
>  include/linux/irqflags.h|4 ++--
>  kernel/bpf/stackmap.c   |2 +-
>  kernel/irq_work.c   |   18 +-
>  kernel/printk/printk.c  |6 ++
>  kernel/rcu/tree.c   |3 +--
>  kernel/time/tick-sched.c|6 ++
>  kernel/trace/bpf_trace.c|2 +-
>  9 files changed, 41 insertions(+), 37 deletions(-)
> 
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -196,7 +196,7 @@ __notify_execute_cb(struct i915_request
>  
>   llist_for_each_entry_safe(cb, cn,
> llist_del_all(&rq->execute_cb),
> -   work.llnode)
> +   work.node.llist)
>   fn(&cb->work);
>  }
>  
> @@ -478,7 +478,7 @@ __await_execution(struct i915_request *r
>* callback first, then checking the ACTIVE bit, we serialise with
>* the completed/retired request.
>*/
> - if (llist_add(&cb->work.llnode, &signal->execute_cb)) {
> + if (llist_add(&cb->work.node.llist, &signal->execute_cb)) {
>   if (i915_request_is_active(signal) ||
>   __request_in_flight(signal))
>   __notify_execute_cb_imm(signal);

Hm, so I was looking at picking up some of the low risk bits (patches #1, #2, #4)
from this series for v5.9, but the above hunk depends non-trivially on the
linux-next DRM tree, in particular:

  1d9221e9d395: ("drm/i915: Skip signaling a signaled request")
  3255e00edb91: ("drm/i915: Remove i915_request.lock requirement for execution callbacks")
  etc.

We could add it sans the i915 bits, but then we'd introduce a semantic 
conflict in linux-next which isn't nice so close to the merge window.

One solution would be to delay this into the merge window to after the 
DRM tree gets merged by Linus. Another would be to help out Stephen 
with the linux-next merge.

What would be your preference?

Thanks,

Ingo


Re: [RFC][PATCH 1/9] irq_work: Cleanup

2020-07-23 Thread Paul E. McKenney
On Wed, Jul 22, 2020 at 05:01:50PM +0200, Peter Zijlstra wrote:
> Get rid of the __call_single_node union and clean up the API a little
> to avoid external code relying on the structure layout as much.
> 
> Signed-off-by: Peter Zijlstra (Intel) 

As noted earlier, cleaning up that union is most welcome!

Tested-by: Paul E. McKenney 

One nit below, with that fixed:

Reviewed-by: Paul E. McKenney 

> ---
>  drivers/gpu/drm/i915/i915_request.c |4 ++--
>  include/linux/irq_work.h|   33 +
>  include/linux/irqflags.h|4 ++--
>  kernel/bpf/stackmap.c   |2 +-
>  kernel/irq_work.c   |   18 +-
>  kernel/printk/printk.c  |6 ++
>  kernel/rcu/tree.c   |3 +--
>  kernel/time/tick-sched.c|6 ++
>  kernel/trace/bpf_trace.c|2 +-
>  9 files changed, 41 insertions(+), 37 deletions(-)
> 
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -196,7 +196,7 @@ __notify_execute_cb(struct i915_request
>  
>   llist_for_each_entry_safe(cb, cn,
> llist_del_all(&rq->execute_cb),
> -   work.llnode)
> +   work.node.llist)
>   fn(&cb->work);
>  }
>  
> @@ -478,7 +478,7 @@ __await_execution(struct i915_request *r
>* callback first, then checking the ACTIVE bit, we serialise with
>* the completed/retired request.
>*/
> - if (llist_add(&cb->work.llnode, &signal->execute_cb)) {
> + if (llist_add(&cb->work.node.llist, &signal->execute_cb)) {
>   if (i915_request_is_active(signal) ||
>   __request_in_flight(signal))
>   __notify_execute_cb_imm(signal);
> --- a/include/linux/irq_work.h
> +++ b/include/linux/irq_work.h
> @@ -14,28 +14,37 @@
>   */
>  
>  struct irq_work {
> - union {
> - struct __call_single_node node;
> - struct {
> - struct llist_node llnode;
> - atomic_t flags;
> - };
> - };
> + struct __call_single_node node;
>   void (*func)(struct irq_work *);
>  };
>  
> +#define __IRQ_WORK_INIT(_func, _flags) (struct irq_work){\
> + .node = { .u_flags = (_flags), },   \
> + .func = (_func),\
> +}
> +
> +#define IRQ_WORK_INIT(_func) __IRQ_WORK_INIT(_func, 0)
> +#define IRQ_WORK_INIT_LAZY(_func) __IRQ_WORK_INIT(_func, IRQ_WORK_LAZY)
> +#define IRQ_WORK_INIT_HARD(_func) __IRQ_WORK_INIT(_func, IRQ_WORK_HARD_IRQ)
> +
> +#define DEFINE_IRQ_WORK(name, _f)\
> + struct irq_work name = IRQ_WORK_INIT(_f)
> +
>  static inline
>  void init_irq_work(struct irq_work *work, void (*func)(struct irq_work *))
>  {
> - atomic_set(&work->flags, 0);
> - work->func = func;
> + *work = IRQ_WORK_INIT(func);
>  }
>  
> -#define DEFINE_IRQ_WORK(name, _f) struct irq_work name = {   \
> - .flags = ATOMIC_INIT(0),\
> - .func  = (_f)   \
> +static inline bool irq_work_is_pending(struct irq_work *work)
> +{
> + return atomic_read(&work->node.a_flags) & IRQ_WORK_PENDING;
>  }
>  
> +static inline bool irq_work_is_busy(struct irq_work *work)
> +{
> + return atomic_read(&work->node.a_flags) & IRQ_WORK_BUSY;
> +}
>  
>  bool irq_work_queue(struct irq_work *work);
>  bool irq_work_queue_on(struct irq_work *work, int cpu);
> --- a/include/linux/irqflags.h
> +++ b/include/linux/irqflags.h
> @@ -95,12 +95,12 @@ do {  \
>  
> # define lockdep_irq_work_enter(__work)  \
> do {  \
> -   if (!(atomic_read(&__work->flags) & IRQ_WORK_HARD_IRQ))\
> +   if (!(atomic_read(&__work->node.a_flags) & IRQ_WORK_HARD_IRQ))\
>   current->irq_config = 1;\
> } while (0)
> # define lockdep_irq_work_exit(__work)   \
> do {  \
> -   if (!(atomic_read(&__work->flags) & IRQ_WORK_HARD_IRQ))\
> +   if (!(atomic_read(&__work->node.a_flags) & IRQ_WORK_HARD_IRQ))\
>   current->irq_config = 0;\
> } while (0)
>  
> --- a/kernel/bpf/stackmap.c
> +++ b/kernel/bpf/stackmap.c
> @@ -293,7 +293,7 @@ static void stack_map_get_build_id_offse
>   if (irqs_disabled()) {
>   if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
>   work = this_cpu_ptr(&up_read_work);
> - if (atomic_read(&work->irq_work.flags) & IRQ_WORK_BUSY) {
> + if (irq_work_is_busy(&work->irq_work)) {
>   /* cannot queue more