Hi folks,

So I've seen reports a few times now of latency spikes caused by IPIs,
usually because of isolation misconfiguration, but only detected at the
tail end of e.g. a 24h timerlat run.
It's not because those IPIs are rare, but rather that they don't by
themselves cause a monitored CPU to reach the latency threshold; it's
usually a combined interference that gets us there.

I'd like to make it easier to detect such misconfigurations and thus IPIs
hitting supposedly-isolated CPUs.

I initially kludged a timerlat option to stop tracing as soon as an IPI was
sent to a monitored CPU, regardless of the latency threshold. It sort of did
the trick, but Tomáš convinced me timerlat wasn't really the place for that.

So here's IPI tracking added to osnoise. Two things worth pointing out:
o This only adds IPI count tracking, nothing about noise duration - this is
  already tracked as part of the IRQ noise.
o This modifies the osnoise Ftrace entry; I have no idea how acceptable this
  is, although the only real consumer of these should be rtla...

Tested with:

  $ rtla osnoise top -d 5s
  $ trace-cmd record -p osnoise hackbench -l 10000

Cheers,
Valentin

Valentin Schneider (2):
  tracing/osnoise: Sample IPI counts
  rtla/osnoise: Report IPI count in osnoise top

 include/trace/events/osnoise.h       |  1 +
 kernel/trace/trace_entries.h         |  6 ++-
 kernel/trace/trace_osnoise.c         | 80 ++++++++++++++++++++++++++--
 tools/tracing/rtla/src/osnoise_top.c |  9 +++-
 4 files changed, 88 insertions(+), 8 deletions(-)

--
2.54.0
