On Tue, 6 Jan 2026 17:04:05 -0500
Steven Rostedt <[email protected]> wrote:

> On Tue,  6 Jan 2026 10:10:39 +0100
> Petr Tesarik <[email protected]> wrote:
> 
> > Avoid running the wakeup irq_work on an isolated CPU. Since the wakeup can
> > run on any CPU, let's pick a housekeeping CPU to do the job.
> > 
> > This change reduces additional noise when tracing isolated CPUs. For
> > example, the following ipi_send_cpu stack trace was captured with
> > nohz_full=2 on the isolated CPU:
> > 
> >           <idle>-0       [002] d.h4.  1255.379293: ipi_send_cpu: cpu=2 callsite=irq_work_queue+0x2d/0x50 callback=rb_wake_up_waiters+0x0/0x80
> >           <idle>-0       [002] d.h4.  1255.379329: <stack trace>  
> >  => trace_event_raw_event_ipi_send_cpu
> >  => __irq_work_queue_local
> >  => irq_work_queue
> >  => ring_buffer_unlock_commit
> >  => trace_buffer_unlock_commit_regs
> >  => trace_event_buffer_commit
> >  => trace_event_raw_event_x86_irq_vector
> >  => __sysvec_apic_timer_interrupt
> >  => sysvec_apic_timer_interrupt
> >  => asm_sysvec_apic_timer_interrupt
> >  => pv_native_safe_halt
> >  => default_idle
> >  => default_idle_call
> >  => do_idle
> >  => cpu_startup_entry
> >  => start_secondary
> >  => common_startup_64    
> 
> I take it that even with this patch you would still get the above events.
> The only difference would be the "cpu=" in the event info will not be the
> same as the CPU it executed on, right?

Yes, this is a trace of a similar event after applying the patch:

          <idle>-0       [002] d.h4.   313.334367: ipi_send_cpu: cpu=1 callsite=irq_work_queue_on+0x55/0x90 callback=generic_smp_call_function_single_interrupt+0x0/0x20
          <idle>-0       [002] d.h4.   313.334390: <stack trace>
 => trace_event_raw_event_ipi_send_cpu
 => __smp_call_single_queue
 => irq_work_queue_on
 => ring_buffer_unlock_commit
 => trace_buffer_unlock_commit_regs
 => trace_event_buffer_commit
 => trace_event_raw_event_x86_irq_vector
 => __sysvec_apic_timer_interrupt
 => sysvec_apic_timer_interrupt
 => asm_sysvec_apic_timer_interrupt
 => pv_native_safe_halt
 => default_idle
 => default_idle_call
 => do_idle
 => cpu_startup_entry
 => start_secondary
 => common_startup_64

The callback function in the trace event is different. That's because
send_call_function_single_ipi() always reports
generic_smp_call_function_single_interrupt as the callback, regardless of
which function was actually queued. Maybe that can be improved, and I can
look into it, but it is clearly a separate issue.
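
For readers following along, the CPU selection idea can be sketched in
user-space C. This is only an illustration, not the patch itself: the
names cpu_mask_t, cpu_is_housekeeping() and pick_wakeup_cpu() are
hypothetical, the 64-bit mask stands in for the kernel's housekeeping
cpumask, and keeping the work local when the current CPU is already a
housekeeping CPU is an assumption. The sketch simply takes the
lowest-numbered housekeeping CPU; the kernel may choose a different one
(the trace above shows cpu=1).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's housekeeping cpumask:
 * bit N set means CPU N is a housekeeping (non-isolated) CPU.
 */
typedef uint64_t cpu_mask_t;

/* Return true if 'cpu' is marked as a housekeeping CPU in 'hk_mask'. */
static bool cpu_is_housekeeping(cpu_mask_t hk_mask, int cpu)
{
	return cpu >= 0 && cpu < 64 && ((hk_mask >> cpu) & 1u);
}

/*
 * Pick the CPU that should run the wakeup work.
 *
 * Assumption: if the current CPU is already a housekeeping CPU, the
 * work stays local. Otherwise the lowest-numbered housekeeping CPU is
 * used. If the mask is empty, fall back to the current CPU.
 */
static int pick_wakeup_cpu(cpu_mask_t hk_mask, int this_cpu)
{
	if (cpu_is_housekeeping(hk_mask, this_cpu))
		return this_cpu;

	for (int cpu = 0; cpu < 64; cpu++) {
		if (cpu_is_housekeeping(hk_mask, cpu))
			return cpu;
	}

	return this_cpu;
}
```

With four CPUs and nohz_full=2, the housekeeping mask would be 0xB
(CPUs 0, 1 and 3), so pick_wakeup_cpu(0xB, 2) redirects the work away
from the isolated CPU 2, while pick_wakeup_cpu(0xB, 1) leaves it on
CPU 1.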

> > The IRQ work interrupt alone adds considerable noise, but the impact can
> > get even worse with PREEMPT_RT, because the IRQ work interrupt is then
> > handled by a separate kernel thread. This requires a task switch and makes
> > tracing useless for analyzing latency on an isolated CPU.
> > 
> > Signed-off-by: Petr Tesarik <[email protected]>  
> 
> LGTM,
> 
> I'll queue it up for the next merge window.

Thank you!

Petr T
