Re: [patch 2/2] nohz: try to avoid IPI when setting tick dependency for task
On Thu, Sep 03, 2020 at 05:01:53PM +0200, Frederic Weisbecker wrote:
> On Tue, Aug 25, 2020 at 03:41:49PM -0300, Marcelo Tosatti wrote:
> > When enabling per-CPU posix timers, an IPI to nohz_full CPUs might be
> > performed (to re-read the dependencies and possibly not re-enter
> > nohz_full on a given CPU).
> >
> > A common case is for applications that run on nohz_full= CPUs
> > to not use POSIX timers (eg DPDK).
> >
> > This patch optimizes tick_nohz_dep_set_task to avoid kicking
> > all nohz_full= CPUs in case the task allowed mask does not
> > intersect with nohz_full= CPU mask,
> > when going through tick_nohz_dep_set_task.
> >
> > This reduces interruptions to nohz_full= CPUs.
> >
> > ---
> >  kernel/time/tick-sched.c |    9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > Index: linux-2.6/kernel/time/tick-sched.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/time/tick-sched.c
> > +++ linux-2.6/kernel/time/tick-sched.c
> > @@ -383,11 +383,16 @@ void tick_nohz_dep_set_task(struct task_
> >  			tick_nohz_full_kick();
> >  			preempt_enable();
> >  		} else {
> > +			unsigned long flags;
> > +
> >  			/*
> >  			 * Some future tick_nohz_full_kick_task()
> > -			 * should optimize this.
> > +			 * should further optimize this.
> >  			 */
> > -			tick_nohz_full_kick_all();
> > +			raw_spin_lock_irqsave(&tsk->pi_lock, flags);
> > +			if (cpumask_intersects(&tsk->cpus_mask, tick_nohz_full_mask))
> > +				tick_nohz_full_kick_all();
> > +			raw_spin_unlock_irqrestore(&tsk->pi_lock, flags);
> >  		}
> >  	}
> >  }
> 
> Not long ago, Peterz suggested that we simply do:
> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index f0199a4ba1ad..42ce8e458013 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -357,17 +357,26 @@ EXPORT_SYMBOL_GPL(tick_nohz_dep_clear_cpu);
>  void tick_nohz_dep_set_task(struct task_struct *tsk, enum tick_dep_bits bit)
>  {
>  	if (!atomic_fetch_or(BIT(bit), &tsk->tick_dep_mask)) {
> -		if (tsk == current) {
> -			preempt_disable();
> -			tick_nohz_full_kick();
> -			preempt_enable();
> -		} else {
> -			/*
> -			 * Some future tick_nohz_full_kick_task()
> -			 * should optimize this.
> -			 */
> -			tick_nohz_full_kick_all();
> -		}
> +		int cpu = task_cpu(tsk);
> +
> +		/*
> +		 * If the task concurrently migrates to another cpu,
> +		 * we guarantee it sees the new tick dependency upon
> +		 * schedule.
> +		 *
> +		 * set_task_cpu(p, cpu);
> +		 *   STORE p->cpu = @cpu
> +		 * __schedule() (switch to task 'p')
> +		 *   LOCK rq->lock
> +		 *   smp_mb__after_spin_lock()	STORE p->tick_dep_mask
> +		 *   tick_nohz_task_switch()		smp_mb() (atomic_fetch_or())
> +		 *      LOAD p->tick_dep_mask		LOAD p->cpu
> +		 */
> +
> +		preempt_disable();
> +		if (cpu_online(cpu))
> +			tick_nohz_full_kick_cpu(cpu);
> +		preempt_enable();
>  	}
>  }
>  EXPORT_SYMBOL_GPL(tick_nohz_dep_set_task);

This can also be used for the signal case... thanks.
Re: [patch 2/2] nohz: try to avoid IPI when setting tick dependency for task
On Tue, Aug 25, 2020 at 03:41:49PM -0300, Marcelo Tosatti wrote:
> When enabling per-CPU posix timers, an IPI to nohz_full CPUs might be
> performed (to re-read the dependencies and possibly not re-enter
> nohz_full on a given CPU).
>
> A common case is for applications that run on nohz_full= CPUs
> to not use POSIX timers (eg DPDK).
>
> This patch optimizes tick_nohz_dep_set_task to avoid kicking
> all nohz_full= CPUs in case the task allowed mask does not
> intersect with nohz_full= CPU mask,
> when going through tick_nohz_dep_set_task.
>
> This reduces interruptions to nohz_full= CPUs.
>
> ---
>  kernel/time/tick-sched.c |    9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> Index: linux-2.6/kernel/time/tick-sched.c
> ===================================================================
> --- linux-2.6.orig/kernel/time/tick-sched.c
> +++ linux-2.6/kernel/time/tick-sched.c
> @@ -383,11 +383,16 @@ void tick_nohz_dep_set_task(struct task_
>  			tick_nohz_full_kick();
>  			preempt_enable();
>  		} else {
> +			unsigned long flags;
> +
>  			/*
>  			 * Some future tick_nohz_full_kick_task()
> -			 * should optimize this.
> +			 * should further optimize this.
>  			 */
> -			tick_nohz_full_kick_all();
> +			raw_spin_lock_irqsave(&tsk->pi_lock, flags);
> +			if (cpumask_intersects(&tsk->cpus_mask, tick_nohz_full_mask))
> +				tick_nohz_full_kick_all();
> +			raw_spin_unlock_irqrestore(&tsk->pi_lock, flags);
>  		}
>  	}
>  }

Not long ago, Peterz suggested that we simply do:

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index f0199a4ba1ad..42ce8e458013 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -357,17 +357,26 @@ EXPORT_SYMBOL_GPL(tick_nohz_dep_clear_cpu);
 void tick_nohz_dep_set_task(struct task_struct *tsk, enum tick_dep_bits bit)
 {
 	if (!atomic_fetch_or(BIT(bit), &tsk->tick_dep_mask)) {
-		if (tsk == current) {
-			preempt_disable();
-			tick_nohz_full_kick();
-			preempt_enable();
-		} else {
-			/*
-			 * Some future tick_nohz_full_kick_task()
-			 * should optimize this.
-			 */
-			tick_nohz_full_kick_all();
-		}
+		int cpu = task_cpu(tsk);
+
+		/*
+		 * If the task concurrently migrates to another cpu,
+		 * we guarantee it sees the new tick dependency upon
+		 * schedule.
+		 *
+		 * set_task_cpu(p, cpu);
+		 *   STORE p->cpu = @cpu
+		 * __schedule() (switch to task 'p')
+		 *   LOCK rq->lock
+		 *   smp_mb__after_spin_lock()	STORE p->tick_dep_mask
+		 *   tick_nohz_task_switch()		smp_mb() (atomic_fetch_or())
+		 *      LOAD p->tick_dep_mask		LOAD p->cpu
+		 */
+
+		preempt_disable();
+		if (cpu_online(cpu))
+			tick_nohz_full_kick_cpu(cpu);
+		preempt_enable();
 	}
 }
 EXPORT_SYMBOL_GPL(tick_nohz_dep_set_task);
[patch 2/2] nohz: try to avoid IPI when setting tick dependency for task
When enabling per-CPU posix timers, an IPI to nohz_full CPUs might be
performed (to re-read the dependencies and possibly not re-enter
nohz_full on a given CPU).

A common case is for applications that run on nohz_full= CPUs
to not use POSIX timers (eg DPDK).

This patch optimizes tick_nohz_dep_set_task to avoid kicking
all nohz_full= CPUs in case the task allowed mask does not
intersect with nohz_full= CPU mask,
when going through tick_nohz_dep_set_task.

This reduces interruptions to nohz_full= CPUs.

---
 kernel/time/tick-sched.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Index: linux-2.6/kernel/time/tick-sched.c
===================================================================
--- linux-2.6.orig/kernel/time/tick-sched.c
+++ linux-2.6/kernel/time/tick-sched.c
@@ -383,11 +383,16 @@ void tick_nohz_dep_set_task(struct task_
 			tick_nohz_full_kick();
 			preempt_enable();
 		} else {
+			unsigned long flags;
+
 			/*
 			 * Some future tick_nohz_full_kick_task()
-			 * should optimize this.
+			 * should further optimize this.
 			 */
-			tick_nohz_full_kick_all();
+			raw_spin_lock_irqsave(&tsk->pi_lock, flags);
+			if (cpumask_intersects(&tsk->cpus_mask, tick_nohz_full_mask))
+				tick_nohz_full_kick_all();
+			raw_spin_unlock_irqrestore(&tsk->pi_lock, flags);
 		}
 	}
 }