This is v2 of my previously posted 'hashtriggers' patchset [1], but
renamed to 'hist triggers' following feedback from v1.

Since then, the kernel has gained a tracing map implementation in the
form of bpf_map, which this patchset makes a bit more generic, exports
and uses (as tracing_map_*, still in the bpf syscall file however).

A large part of the initial hash triggers implementation was devoted
to a map implementation and general-purpose hashing functions, which
have now been subsumed by the bpf maps.  I've completely redone the
trigger patches themselves to work on top of tracing_map.  The result
is a much simpler and easier-to-review patchset that's able to focus
more directly on the problem at hand.

The new version addresses all the comments from the previous review,
including changing the name from hash->hist, adding separate 'hist'
files for the output, and moving the examples into Documentation.

This patchset also includes a couple of other new and related
triggers, enable_hist and disable_hist.  They are very similar to the
existing enable_event/disable_event triggers, which automatically
enable and disable events based on a triggering condition, but they
enable and disable hist triggers instead.

The only problem with using the bpf_map implementation for this is
that it uses kmalloc internally, which causes problems when trying to
trace kmalloc itself.  I'm guessing the ebpf tracing code would also
share this problem e.g. when using bpf_maps from probes on kmalloc().
This patchset attempts a solution to that problem (by adding a gfp
flag and changing the kmem memory allocation tracepoints to
conditional variants that check for it), but I'm not sure it's the
best way to address it.
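
As a rough userspace illustration (Python, not kernel code), the
sketch below models one way the problem can show up and how a
'notrace' allocation flag avoids it.  It assumes the failure mode is
re-entry: the kmalloc tracepoint handler inserts into the map, the map
allocates, and that allocation fires the tracepoint again.  The
Tracer class, GFP_NOTRACE value, ENTRY_SIZE and the depth limit are
all made up for the example.

```python
GFP_KERNEL = 0x1
GFP_NOTRACE = 0x2   # hypothetical stand-in for ___GFP_NOTRACE
ENTRY_SIZE = 32     # made-up size of one map entry
MAX_DEPTH = 8       # made-up limit so the model fails fast


class Tracer:
    """Toy model of a hist trigger attached to a kmalloc tracepoint."""

    def __init__(self, map_alloc_flags):
        self.map_alloc_flags = map_alloc_flags  # flags the map allocates with
        self.hist = {}                          # bytes requested -> hitcount
        self.depth = 0
        self.max_depth = 0

    def kmalloc(self, size, flags=GFP_KERNEL):
        # Conditional tracepoint: only fires when NOTRACE is not set.
        if not flags & GFP_NOTRACE:
            self.trace_kmalloc(size)
        return bytearray(size)

    def trace_kmalloc(self, size):
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)
        try:
            if self.depth > MAX_DEPTH:
                raise RecursionError("tracepoint handler re-entered")
            if size not in self.hist:
                # A new map entry needs memory, so the handler allocates.
                self.kmalloc(ENTRY_SIZE, self.map_alloc_flags)
                self.hist[size] = 0
            self.hist[size] += 1
        finally:
            self.depth -= 1


if __name__ == "__main__":
    ok = Tracer(GFP_KERNEL | GFP_NOTRACE)
    for size in (64, 64, 128):
        ok.kmalloc(size)
    print(ok.hist, "max handler depth:", ok.max_depth)
```

Constructing the model with plain GFP_KERNEL for the map allocations
makes the first traced kmalloc() re-enter the handler until the depth
limit trips.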

There are a couple of important bits of functionality that were
present in v1 but dropped in v2 mainly because I'm still trying to
figure out the best way to accomplish those things using the bpf_map
implementation.

The first is support for compound keys.  Currently, maps can only be
keyed on a single event field, whereas in v1 they could be keyed on
multiple fields.  With support for compound keys, you can create much
more interesting output, such as per-pid lists of syscalls or read
counts e.g.:

  # echo 'hist:keys=common_pid.execname,id.syscall:vals=hitcount' > \
        /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger

  # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist

  key: common_pid:bash[3112], id:sys_write                     vals: count:69
  key: common_pid:bash[3112], id:sys_rt_sigprocmask            vals: count:218

  key: common_pid:update-notifier[3164], id:sys_poll           vals: count:37
  key: common_pid:update-notifier[3164], id:sys_recvfrom       vals: count:118

  key: common_pid:deja-dup-monito[3194], id:sys_sendto         vals: count:1
  key: common_pid:deja-dup-monito[3194], id:sys_read           vals: count:4
  key: common_pid:deja-dup-monito[3194], id:sys_poll           vals: count:8
  key: common_pid:deja-dup-monito[3194], id:sys_recvmsg        vals: count:8
  key: common_pid:deja-dup-monito[3194], id:sys_getegid        vals: count:8

  key: common_pid:emacs[3275], id:sys_fsync                    vals: count:1
  key: common_pid:emacs[3275], id:sys_open                     vals: count:1
  key: common_pid:emacs[3275], id:sys_symlink                  vals: count:2
  key: common_pid:emacs[3275], id:sys_poll                     vals: count:23
  key: common_pid:emacs[3275], id:sys_select                   vals: count:23
  key: common_pid:emacs[3275], id:unknown_syscall              vals: count:34
  key: common_pid:emacs[3275], id:sys_ioctl                    vals: count:60
  key: common_pid:emacs[3275], id:sys_rt_sigprocmask           vals: count:116

  key: common_pid:cat[3323], id:sys_munmap                     vals: count:1
  key: common_pid:cat[3323], id:sys_fadvise64                  vals: count:1
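
A minimal userspace sketch (Python, not the kernel implementation) of
what a compound key amounts to: each event is bucketed by the tuple of
its key fields rather than by a single field.  The hist_compound()
helper and the sample events are made up for illustration.

```python
from collections import Counter


def hist_compound(events, keys):
    """Count events per tuple of key fields (the compound key)."""
    hist = Counter()
    for ev in events:
        hist[tuple(ev[k] for k in keys)] += 1
    return hist


# Made-up sample events carrying the two fields used in the example above.
events = [
    {"common_pid": 3112, "id": "sys_write"},
    {"common_pid": 3112, "id": "sys_write"},
    {"common_pid": 3112, "id": "sys_rt_sigprocmask"},
    {"common_pid": 3275, "id": "sys_write"},
]

if __name__ == "__main__":
    for key, count in sorted(hist_compound(events, ("common_pid", "id")).items()):
        print("key:", key, "count:", count)
```

With a single key such as ("id",), the two pids' sys_write events fall
into the same bucket, which is the information the compound key keeps
apart.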

Related to that is support for sorting on multiple fields.  Currently,
you can sort using only a primary key.  Being able to sort on multiple
or at least a secondary key is indispensable for seeing trends when
displaying multiple values.
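
For illustration only, a userspace Python sketch of sorting on a
primary and a secondary field: entries are ordered by a tuple of the
chosen fields, so ties on the first field are broken by the second.
The sort_hist() helper is hypothetical; the rows reuse counts from the
example output above.

```python
def sort_hist(entries, sort_keys):
    """Return entries ordered by each field in sort_keys, in priority order."""
    return sorted(entries, key=lambda e: tuple(e[k] for k in sort_keys))


entries = [
    {"common_pid": 3275, "id": "sys_ioctl", "hitcount": 60},
    {"common_pid": 3112, "id": "sys_rt_sigprocmask", "hitcount": 218},
    {"common_pid": 3275, "id": "sys_fsync", "hitcount": 1},
    {"common_pid": 3112, "id": "sys_write", "hitcount": 69},
]

if __name__ == "__main__":
    # Primary key: pid.  Secondary key: hitcount within each pid.
    for e in sort_hist(entries, ("common_pid", "hitcount")):
        print(e["common_pid"], e["id"], e["hitcount"])
```

Sorting on ("hitcount",) alone would interleave the two pids, which is
why a secondary key matters once the map has compound keys.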

[1] http://thread.gmane.org/gmane.linux.kernel/1673551

Changes from v1:
 - completely rewritten on top of tracing_map (renamed and exported bpf_map)
 - added map clearing and client ops to tracing_map
 - changed the name from 'hash' triggers to 'hist' triggers
 - added new trigger 'pause' feature
 - added new enable_hist and disable_hist triggers
 - added usage for hist/enable_hist/disable_hist to tracing/README
 - moved examples into Documentation/trace/events.txt
 - added ___GFP_NOTRACE, kmalloc/kfree macros, and conditional kmem tracepoints

The following changes since commit 49058038a12cfd9044146a1bf4b286781268d5c9:

  ring-buffer: Do not wake up a splice waiter when page is not full (2015-02-24 14:00:41 -0600)

are available in the git repository at:

  git://git.yoctoproject.org/linux-yocto-contrib.git tzanussi/hist-triggers-v2
  http://git.yoctoproject.org/cgit/cgit.cgi/linux-yocto-contrib/log/?h=tzanussi/hist-triggers-v2

Tom Zanussi (15):
  tracing: Make ftrace_event_field checking functions available
  tracing: Add event record param to trigger_ops.func()
  tracing: Add get_syscall_name()
  bpf: Export bpf map functionality as trace_map_*
  bpf: Export a map-clearing function
  bpf: Add tracing_map client ops
  mm: Add ___GFP_NOTRACE
  tracing: Make kmem memory allocation tracepoints conditional
  tracing: Add kmalloc/kfree macros
  bpf: Make tracing_map use kmalloc/kfree_notrace()
  tracing: Add a per-event-trigger 'paused' field
  tracing: Add 'hist' event trigger command
  tracing: Add sorting to hist triggers
  tracing: Add enable_hist/disable_hist triggers
  tracing: Add 'hist' trigger Documentation

 Documentation/trace/events.txt      |  870 +++++++++++++++++++++
 include/linux/bpf.h                 |   15 +
 include/linux/ftrace_event.h        |    9 +-
 include/linux/gfp.h                 |    3 +-
 include/linux/slab.h                |   61 +-
 include/trace/events/kmem.h         |   28 +-
 kernel/bpf/arraymap.c               |   16 +
 kernel/bpf/hashtab.c                |   39 +-
 kernel/bpf/syscall.c                |  193 ++++-
 kernel/trace/trace.c                |   48 ++
 kernel/trace/trace.h                |   25 +-
 kernel/trace/trace_events.c         |    3 +
 kernel/trace/trace_events_filter.c  |   15 +-
 kernel/trace/trace_events_trigger.c | 1466 ++++++++++++++++++++++++++++++++++-
 kernel/trace/trace_syscalls.c       |   11 +
 mm/slab.c                           |   45 +-
 mm/slob.c                           |   45 +-
 mm/slub.c                           |   47 +-
 18 files changed, 2795 insertions(+), 144 deletions(-)

-- 
1.9.3
