On Thu,  4 Mar 2021 11:01:37 +0200
"Yordan Karadzhov (VMware)" <y.kar...@gmail.com> wrote:

Thanks Yordan for doing this!

I have some comments below.

> diff --git a/kernel/trace/trace_entries.h b/kernel/trace/trace_entries.h
> index 4547ac59da61..8007f9b6417f 100644
> --- a/kernel/trace/trace_entries.h
> +++ b/kernel/trace/trace_entries.h
> @@ -338,3 +338,19 @@ FTRACE_ENTRY(hwlat, hwlat_entry,
>                __entry->nmi_total_ts,
>                __entry->nmi_count)
>  );
> +
> +FTRACE_ENTRY(func_repeats, func_repeats_entry,
> +
> +     TRACE_FUNC_REPEATS,
> +
> +     F_STRUCT(
> +             __field(        unsigned long,  ip      )
> +             __field(        unsigned long,  pip     )
> +             __field(        unsigned long,  count   )
> +     ),
> +
> +     F_printk(" %ps <-%ps\t(repeats:%lu)",
> +              (void *)__entry->ip,
> +              (void *)__entry->pip,
> +              __entry->count)

After playing with this a little, I realized that we should also store the
last timestamp as well. I think that would be interesting information.

           <...>-37      [004] ...1  2022.303820: gc_worker <-process_one_work
           <...>-37      [004] ...1  2022.303820: ___might_sleep <-gc_worker
           <...>-37      [004] ...1  2022.303831: ___might_sleep <-gc_worker 
(repeats: 127)
           <...>-37      [004] ...1  2022.303831: queue_delayed_work_on 
<-process_one_work

The above shows that __might_sleep() was called 128 times, but what I don't
get from the above, is when that last call was made. You'll see that the
timestamp for the repeat output is the same as the next function shown
(queue_delayed_work_on()). But the timestamp for the last call to
__might_sleep() is lost, and the repeat event ends up being written when
it is detected that there are no more repeats.

If we had:

           <...>-37      [004] ...1  2022.303820: gc_worker <-process_one_work
           <...>-37      [004] ...1  2022.303820: ___might_sleep <-gc_worker
           <...>-37      [004] ...1  2022.303831: ___might_sleep <-gc_worker 
(last ts: 2022.303828 repeats: 127)
           <...>-37      [004] ...1  2022.303831: queue_delayed_work_on 
<-process_one_work

We would know the last time __might_sleep was called.

That is, not only should we save the ip and pip in the trace_func_repeats
structure, but we should also be storing the last time stamp of the last
function event that repeated. Otherwise the above looks like the last
__might_sleep called above happened when the queue_delayed_work_on
happened, where that may not be the case.

-- Steve


Reply via email to