On Thu, 4 Mar 2021 11:01:37 +0200 "Yordan Karadzhov (VMware)" <y.kar...@gmail.com> wrote:
Thanks Yordan for doing this! I have some comments below. > diff --git a/kernel/trace/trace_entries.h b/kernel/trace/trace_entries.h > index 4547ac59da61..8007f9b6417f 100644 > --- a/kernel/trace/trace_entries.h > +++ b/kernel/trace/trace_entries.h > @@ -338,3 +338,19 @@ FTRACE_ENTRY(hwlat, hwlat_entry, > __entry->nmi_total_ts, > __entry->nmi_count) > ); > + > +FTRACE_ENTRY(func_repeats, func_repeats_entry, > + > + TRACE_FUNC_REPEATS, > + > + F_STRUCT( > + __field( unsigned long, ip ) > + __field( unsigned long, pip ) > + __field( unsigned long, count ) > + ), > + > + F_printk(" %ps <-%ps\t(repeats:%lu)", > + (void *)__entry->ip, > + (void *)__entry->pip, > + __entry->count) After playing with this a little, I realized that we should also store the last timestamp as well. I think that would be interesting information. <...>-37 [004] ...1 2022.303820: gc_worker <-process_one_work <...>-37 [004] ...1 2022.303820: ___might_sleep <-gc_worker <...>-37 [004] ...1 2022.303831: ___might_sleep <-gc_worker (repeats: 127) <...>-37 [004] ...1 2022.303831: queue_delayed_work_on <-process_one_work The above shows that __might_sleep() was called 128 times, but what I don't get from the above, is when that last call was made. You'll see that the timestamp for the repeat output is the same as the next function shown (queue_delayed_work_on()). But the timestamp for the last call to __might_sleep() is lost, and the repeat event ends up being written when it is detected that there are no more repeats. If we had: <...>-37 [004] ...1 2022.303820: gc_worker <-process_one_work <...>-37 [004] ...1 2022.303820: ___might_sleep <-gc_worker <...>-37 [004] ...1 2022.303831: ___might_sleep <-gc_worker (last ts: 2022.303828 repeats: 127) <...>-37 [004] ...1 2022.303831: queue_delayed_work_on <-process_one_work We would know the last time __might_sleep was called. That is, not only should we save the ip and pip in the trace_func_repeats structure, but we should also be storing the last time stamp of the last function event that repeated. Otherwise the above looks like the last __might_sleep called above happened when the queue_delayed_work_on happened, where that may not be the case. -- Steve