On 11/06/26 10:59, Tomas Glozar wrote: > [just replying to comments, will do a full review later] > > st 10. 6. 2026 v 21:51 odesÃlatel Crystal Wood <[email protected]> napsal: >> >> On Wed, 2026-06-10 at 15:04 +0200, Valentin Schneider wrote: >> > Osnoise already implictly accounts IPIs via its IRQ tracking, >> >> Does it? It seems that IPIs bypass the kernel/irq subsystem on some >> arches (including x86, but not ARM). >> >> It would be nice to solve this properly by adding generic ipi >> entry/exit tracing (similar to what ARM already has). >> > > Isn't that precisely what the ipi tracepoints used by this > implementation (ipi:ipi_send_cpu) are for? >
Well, these catch the emission of the IPI, which is great for investigation - slap a stacktrace trigger and you (most of the time) get the source of your interference. However Crystal's point is that on x86 (and I assume other archs) receiving & handling these IPIs is "special" and doesn't go through the generic irq subsystem and thus has to be tracked separately, which is why osnoise has this fairly lengthy osnoise_arch_register() thing. >> > however it >> > can be interesting to distiguish between the two: undesired IPIs usually >> > imply a software configuration issue (e.g. wrong/incomplete CPU isolation) >> > whereas undesired (non-IPI) IRQs usually imply a hardware configuration >> > issue. >> > >> > Signed-off-by: Valentin Schneider <[email protected]> >> > --- >> > Note that this is modifying the osnoise:osnoise_entry Ftrace entry; I know >> > trace events are sort of supposed to be stable, but I'm not sure about >> > ftrace entries. >> >> I think old rtla will be OK with this since it looks up fields by name >> rather than assuming a fixed layout. >> > > Yeah, the fields are either looked up with tep_get_field_val() [2], or > with name-based BPF CO-RE relocations against the tracepoint structure > [3]. So this shouldn't be an issue, as long as the old counts stay the > same. > > [2] > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/tools/tracing/rtla/src/timerlat_hist.c#n191 > [3] > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/tools/tracing/rtla/src/timerlat.bpf.c#n12 > >> > Alternatively I can have this be purely supported in userspace osnoise by >> > hooking into the IPI events and counting IPIs separately from the osnoise >> > events. >> >> One benefit I could see of doing this in kernel osnoise would be if you >> could atomically correlate the count with the particular noise >> interval, but this patch doesn't do that. >> > > The count is already reported by cycle on the kernel side in the > patchset, right? It's only missing in the current RTLA (userspace) > part, as there is no statistic using the information. But it can still > be collected through custom histogram triggers. > >> > ... >> > >> > +static void trace_ipi_send_cpumask_callback(void *data, const struct >> > cpumask *cpumask, >> > + unsigned long callsite, void >> > *callback) >> > +{ >> > + struct osnoise_variables *osn_var; >> > + int cpu; >> > + >> > + for_each_cpu_and(cpu, cpumask, &osnoise_cpumask) { >> > + osn_var = per_cpu_ptr(&per_cpu_osnoise_var, cpu); >> > + ipi_emission(osn_var, cpu); >> > + } >> > +} >> >> Isn't this racy to do from a different CPU? Both in terms of the >> counter, and the timing of the increment relative to when the IPI is >> actually received. Not necessarily a huge deal if you only care about >> zero versus bignum, but still. At least worth a comment, if we go with >> this approach. >> > > I also think it's a bit confusing, especially as the other accesses to > osn_var are cpu-local, but here, "cpu" is the *target* CPU, not the > current CPU. Not sure how expensive it would be to do atomic_add for > that, at least it's something to consider. > I suppose that could be an argument for doing that stat aggregation in userspace osnoise - event handlers are run after the fact via tracefs_iterate_raw_events(), it's all inherently slower since it's just increments of one (one per handled event) but it's also all done in userspace on a control thread and doesn't bog down the kernelspace. > Tomas
