So, what are extension events good for?

They have the desirable feature of signals: you can post one from almost
anywhere in the kernel, and the event gets processed at the safe place just
before returning to user mode (where you hold no kernel locks, can safely
use user_regset, etc.).  

But they are not signals, so we separate debugger magic from normal user
facilities.  Signals are a user resource (queue size rlimit, can be blocked
via sigprocmask, etc.), whereas utrace events are a debugger resource.

Use them instead of signals for hardware events induced by the debugger.
This can include single-step, and possibly breakpoint traps.  hw_breakpoint
hits, BTS buffer fills, and the like can be "requested" extension events
when they track user-mode activity.  All these things happen at a low level:
in trap handlers, in context switch, in places down in the kernel where
generalized callbacks are not safe.  But they reflect user-level events, so
they can safely be processed later, just before returning to user mode.

For a perfmon-like facility with in-kernel APIs, a performance counter
trigger for per-thread counters could produce an extension event.  

A more complex perfmon-style example (using either hardware counters or
time-based profiling) would be some logic driven by performance counters
that decides "this thread is experiencing a hot spot".  Maybe it's doing PC
sampling (time-based or PEBS) and notices PC 0x123 was hit a million times
in the last two seconds.  On the millionth hit, it generates a "hot spot"
event.  (Such logic could be operating on particular threads, or running
system-wide.)  While the hot-spot detector code is chugging away inside the
kernel, you can attach the debugger to your favorite poor-performing
process, and say "break on hot spots".  On that millionth sample of the
same overused PC, your debugger will pop up as for a source breakpoint or
program fault, and say "here is the PC/regs/source code/backtrace of an
active hot spot".
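To make the detector side of that scenario concrete, here is a minimal
userspace sketch in C.  It is not utrace or perfmon code; every name in it
(hot_detector, hot_sample, HOT_SLOTS, and so on) is hypothetical and made up
for illustration.  It counts PC samples in a small fixed-size table and
reports the one sample that brings a PC up to the threshold, which is the
point where real producer code would post its "hot spot" event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define HOT_SLOTS 64	/* table capacity; must be a power of two */

/* Hypothetical per-window sample counts, one slot per distinct PC. */
struct hot_detector {
	uintptr_t pc[HOT_SLOTS];
	uint64_t hits[HOT_SLOTS];	/* 0 means the slot is unused */
	uint64_t threshold;		/* hit count that makes a PC "hot" */
};

static void hot_init(struct hot_detector *d, uint64_t threshold)
{
	assert(threshold > 0);
	memset(d, 0, sizeof(*d));
	d->threshold = threshold;
}

/* Start a new sampling window (e.g. every two seconds), keeping the threshold. */
static void hot_reset_window(struct hot_detector *d)
{
	hot_init(d, d->threshold);
}

/*
 * Record one PC sample.  Returns true only for the sample that brings
 * this PC's count exactly to the threshold, so the event fires once
 * per window.  If the table is full, the sample is dropped.
 */
static bool hot_sample(struct hot_detector *d, uintptr_t pc)
{
	size_t start = (size_t)(pc >> 2) & (HOT_SLOTS - 1);

	for (size_t n = 0; n < HOT_SLOTS; n++) {
		size_t i = (start + n) & (HOT_SLOTS - 1);	/* linear probing */

		if (d->hits[i] == 0)
			d->pc[i] = pc;		/* claim an empty slot */
		if (d->pc[i] == pc)
			return ++d->hits[i] == d->threshold;
	}
	return false;
}
```

A sampling handler would call hot_sample() on each sample and post the event
when it returns true; a timer would call hot_reset_window() at the end of
each window.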

That is, there was a "hot-spot" event type registered by that kernel code,
events are in a discoverable name space, and the debugger is using some
user-level interface on top of utrace that can do this discovery and let it
utrace_listen for arbitrary extension events on debugged threads.
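As a rough illustration of that rendezvous, here is a small userspace sketch
in C of a name-based event registry plus a per-thread listen mask.  None of
this is the real utrace interface: event_registry, event_register,
event_lookup, thread_listen_for and thread_wants are hypothetical names, and
the actual registration and utrace_listen signatures may look quite
different.

```c
#include <assert.h>
#include <string.h>

#define MAX_EVENT_TYPES 32	/* fits in the bits of an unsigned long */

/* Hypothetical global name space of extension event types. */
struct event_registry {
	const char *names[MAX_EVENT_TYPES];
	int count;
};

/* Consumer side: discover an event type by name.  Returns its id or -1. */
static int event_lookup(const struct event_registry *r, const char *name)
{
	for (int i = 0; i < r->count; i++)
		if (strcmp(r->names[i], name) == 0)
			return i;
	return -1;
}

/* Producer side: register a new type.  Returns its id, or -1 if the
 * table is full or the name is already taken. */
static int event_register(struct event_registry *r, const char *name)
{
	if (r->count >= MAX_EVENT_TYPES || event_lookup(r, name) >= 0)
		return -1;
	r->names[r->count] = name;
	return r->count++;
}

/* Hypothetical per-thread state: one bit per event type listened for. */
struct thread_listen {
	unsigned long listen_mask;
};

static void thread_listen_for(struct thread_listen *t, int id)
{
	assert(id >= 0 && id < MAX_EVENT_TYPES);
	t->listen_mask |= 1UL << id;
}

/* Nonzero if anyone is listening for this event type on this thread. */
static int thread_wants(const struct thread_listen *t, int id)
{
	assert(id >= 0 && id < MAX_EVENT_TYPES);
	return (int)((t->listen_mask >> id) & 1UL);
}
```

In this sketch the hot-spot detector would call event_register() once with
"hot-spot", and the debugger's interface layer would call event_lookup() with
the same name and then thread_listen_for() on each debugged thread.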

That is one illustrative example of the concept behind extension events.
They provide a general rendezvous point between kernel code that is a
producer of events of interest to tracing, and utrace-based facilities
(either purely in-kernel or kernel-user interface layers) that are
consumers of traceable events tied to thread activity driven by user mode
(as all non-kthreads' activity ultimately is).  There can be many unrelated
pieces of tracing code in the kernel that register event types and post
events, either in subsystem code, arch code, other in-kernel tracing
facilities, or specialized kernel modules made just for tracing.  The
producers of events are not necessarily associated with any particular
consumer of events.  They are available diagnostics, provided for whoever
wants to keep track of them.  (It's intended to be very cheap to post an
event that no one is listening for on the current task, maybe two loads and
a branch not taken.  This is not much worse than the ptrace/utrace "core"
events overhead, which is just:
        if (unlikely(current->flag_word & CONSTANT))
The fast path of utrace_post_event will probably be an inlined:
        if (unlikely(current->flag_word & global.mask_word))
So possibly it's cheap enough to insert unconditionally in some subsystems.
There is an obvious overlap with tracepoints, markers, etc. here, so
synergies should be investigated.)
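Spelled out a little more, the fast path might look like the following
userspace sketch.  This is only an assumption about the eventual shape, not
real utrace code: struct task, event_globals, post_event and post_event_slow
are hypothetical stand-ins, and unlikely() is defined here with the GCC/Clang
__builtin_expect builtin the way the kernel defines it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same definition the kernel uses; relies on a GCC/Clang builtin. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for the relevant part of task_struct. */
struct task {
	unsigned long flag_word;	/* bits set when something listens on this task */
	unsigned int slow_calls;	/* demo only: counts slow-path entries */
	int last_event;			/* demo only: last event id delivered */
};

/* Hypothetical global state; the mask is not a compile-time constant
 * because it is chosen when extension events get registered. */
struct event_globals {
	unsigned long mask_word;
};

static struct event_globals global = { .mask_word = 0 };

/* Slow path: only reached when a listener exists on this task. */
static void post_event_slow(struct task *t, int event_id)
{
	assert(t != NULL);
	t->slow_calls++;
	t->last_event = event_id;
}

/*
 * Fast path: two loads (the task's flag word and the global mask) and
 * one branch that is normally not taken.  Returns true if the event
 * was handed to the slow path.
 */
static inline bool post_event(struct task *t, int event_id)
{
	if (unlikely(t->flag_word & global.mask_word)) {
		post_event_slow(t, event_id);
		return true;
	}
	return false;
}
```

With no listener the call costs only the test; once a listener sets the
matching bit in the task's flag_word, the same call site starts reaching the
slow path.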

Not that we should get into this here, but in the long run, I would like to
see systemtap support extension events as a first-class script language
construct.  Use this in tapsets or in special-purpose scripts, to do:

event lucky13;
probe net.send { if ($len == 13) throw event lucky13 }

Running this script makes "catch lucky13" available in your debugger,
"graph incidents of lucky13" available in your whole-session-watcher, etc.
(I don't mean to get into systemtap feature ideas here, just to illustrate
the open-ended range of uses for extension events.)


Thanks,
Roland
