Tracing Fans,
I know it's been a long time in coming, but the CPU Performance
Counter (CPC) provider is almost here! The code is currently in
for review, and a proposed architecture document is attached here.
Any and all feedback and questions on the proposed implementation
are welcome.
Thanks.
Jon.
Template Version: @(#)sac_nextcase 1.66 04/17/08 SMI
This information is Copyright 2008 Sun Microsystems
1. Introduction
1.1. Project/Component Working Name:
DTrace CPU Performance Counter Provider
1.2. Name of Document Author/Supplier:
Author: Jonathan Haslam
1.3. Date of This Document:
03 July, 2008
4. Technical Description
A. INTRODUCTION
This case adds the 'cpc' provider which will enable consumers to access the
performance counters of a CPU. This will allow users to easily connect CPU
events (e.g. TLB misses, L2 cache misses) to the cause of the event on a
system-wide basis.
The Solaris CPU Performance Counter (CPC) subsystem (PSARC 2002/180) gives
general purpose access to the hardware performance counters of a
microprocessor. The cpc provider leverages the infrastructure provided by
the CPC subsystem to access the CPU performance counter resources of a system.
The provider utilises the hardware overflow interrupt mechanism to allow
profiling based upon CPU performance counter events (in the same way that
the profile provider allows us to profile by time).
B. DESCRIPTION
1. Probe Format
The format of probes made available by the cpc provider is:
    cpc:::event_name-mode-{optional mask}-count
where:
    event_name:     The event name of interest. A full list of the events
                    available on each platform is given in the output of
                    `cpustat -h`.
    mode:           The operating mode of the processor in which the event is
                    counted. Valid settings are "user" (user mode), "kernel"
                    (kernel mode) and "all" (user and kernel mode).
    optional mask:  Some platform specific events can be further specified with
                    the use of a mask (sometimes known as a 'umask' or an
                    'emask'). This field is optional and can only be specified
                    for platform specific events. It cannot be used with generic
                    performance counter events (PSARC 2008/334). It is specified
                    as a hex value.
    count:          The number of events that must be counted on a CPU for a
                    probe to fire on that CPU.
As an example, the specification for a probe which fires every 10000 user mode
DTLB misses on an UltraSPARC IV processor would look like:
cpc:::DTLB_miss-user-10000
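To make the field layout concrete, the following is a minimal Python sketch that splits a probe specification into its parts. It is illustrative only and is not part of the provider; the names CpcProbe and parse_cpc_probe are hypothetical, and it assumes that the mode is always one of the three documented keywords and that a mask, when present, is written in a form Python's int(text, 16) accepts (such as 0x7).

```python
from typing import NamedTuple, Optional

PREFIX = "cpc:::"
VALID_MODES = {"user", "kernel", "all"}  # modes listed in the probe format


class CpcProbe(NamedTuple):
    """Hypothetical container for the fields of a cpc probe name."""
    event: str
    mode: str
    mask: Optional[int]
    count: int


def parse_cpc_probe(spec: str) -> CpcProbe:
    """Split 'cpc:::event_name-mode-[mask-]count' into its fields.

    Fields are consumed from the right so that the optional mask can be
    recognised by its position between the mode and the count.
    """
    if not spec.startswith(PREFIX):
        raise ValueError("probe must start with 'cpc:::'")
    fields = spec[len(PREFIX):].split("-")
    if len(fields) < 3:
        raise ValueError("expected event_name-mode-[mask-]count")

    count_text = fields.pop()
    if not count_text.isdigit():
        raise ValueError("count must be a decimal number")

    mask = None
    if fields[-1] not in VALID_MODES:
        # Not a mode keyword, so treat it as the optional hex mask.
        # int() raises ValueError if the text is not valid hex.
        mask = int(fields.pop(), 16)

    if len(fields) < 2 or fields[-1] not in VALID_MODES:
        raise ValueError("mode must be one of user, kernel, all")
    mode = fields.pop()

    event = "-".join(fields)
    return CpcProbe(event, mode, mask, int(count_text))
```

Calling parse_cpc_probe("cpc:::DTLB_miss-user-10000") yields an event of DTLB_miss, a mode of user, no mask and a count of 10000.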
2. Probe arguments
All probes provide two arguments:
    args[0]   The program counter (PC) in the kernel at the time the probe
              fired, or 0 if the current process was not executing in the
              kernel at the time the probe fired.
    args[1]   The PC in the user-level process at the time the probe fired,
              or 0 if the current process was executing in the kernel at the
              time the probe fired.
3. Probe Availability
Probes are made available dynamically when requested by a user. The probes
available will differ according to the events exported by the CPC subsystem
on a platform. The names of available events can be discovered, as mentioned
in section 'B1 - Probe Format', using the output of `cpustat -h`.
CPU performance counters are a finite resource and the number of probes
that can be enabled depends upon hardware capabilities. Processors
that cannot determine which counter has overflowed when multiple counters
are programmed (e.g. AMD, UltraSPARC) are only allowed to have a single
enabling at any one time. Processors that can detect which counter has
overflowed (e.g. Niagara2, Intel P4) can have multiple probes enabled at
any one time: as many as the hardware will allow, which is at most the
number of counters available on a processor.
Probes are enabled by consumers on a first-come, first-served basis. When
hardware resources are fully utilised, subsequent enablings will fail until
resources become available.
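The enabling policy described above can be pictured with a small toy model. This Python sketch is an assumption-laden illustration, not the driver's implementation: the class CounterPool and its parameters are hypothetical, and real limits depend on the platform's CPC backend.

```python
class CounterPool:
    """Toy model of first-come, first-served cpc probe enabling."""

    def __init__(self, num_counters: int, can_identify_overflow: bool):
        # Hardware that cannot tell which counter overflowed is limited
        # to a single enabling; otherwise the limit is the counter count.
        self.limit = num_counters if can_identify_overflow else 1
        self.enabled = []

    def enable(self, probe: str) -> bool:
        """Return True if the probe was enabled, False if resources ran out."""
        if len(self.enabled) >= self.limit:
            return False
        self.enabled.append(probe)
        return True

    def disable(self, probe: str) -> None:
        """Release the counter held by an enabled probe."""
        self.enabled.remove(probe)
```

With CounterPool(4, False), a second enable() call fails until the first probe is disabled, mirroring the single-enabling restriction; with CounterPool(2, True), two probes can be enabled before a third is refused.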
4. Co-existence with existing tools
The provider has priority over per-LWP libcpc usage (i.e. cputrack)
for access to counters. In the same manner as cpustat, enabling probes
causes all existing per-LWP counter contexts to be invalidated. As long as
these enablings remain active, the counters will remain unavailable to
cputrack-type consumers.
Only one of cpustat and DTrace may use the counter hardware at any one time.
Ownership of the counters is given on a first-come, first-served basis.
5. Limiting Overflow Rate
So as not to saturate the system with overflow interrupts, a default minimum
of 10000 is imposed on the value that can be specified for the 'count'
part of the probe name (refer to section 'B1 - Probe Format'). This minimum
can be reduced explicitly by altering the 'dcpc_min_overflow' kernel variable
with mdb(1), or by modifying the dcpc.conf driver configuration file and then
unloading and reloading the dcpc driver module.
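A rough feel for why a minimum is needed comes from simple arithmetic: the overflow interrupt rate on a CPU is the event rate divided by the count. The Python sketch below is illustrative only; the function names are hypothetical, the 10000 default is taken from the text above, and the event rate used in the usage note is an assumed figure rather than a measurement.

```python
DEFAULT_MIN_OVERFLOW = 10000  # default minimum count stated above


def check_count(count: int, min_overflow: int = DEFAULT_MIN_OVERFLOW) -> int:
    """Reject a probe count below the configured minimum."""
    if count < min_overflow:
        raise ValueError(
            f"count {count} is below the minimum of {min_overflow}"
        )
    return count


def overflow_interrupts_per_second(events_per_second: float, count: int) -> float:
    """Estimate overflow interrupts per second on one CPU."""
    return events_per_second / count
```

For an event that occurs an assumed 2,000,000,000 times per second on a CPU (for example, a cycle counter on a 2 GHz part), a count of 10000 gives 200,000 overflow interrupts per second on that CPU, and a smaller count raises the rate proportionally.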
C. EXAMPLES
1. Instructions executed by applications on an AMD platform:
cpc:::FR_retired_x86_instr_w_excp_intr-user-10000
{
        /* Each firing represents 10000 user-mode events. */
        @[execname] = count();
}
# ./user-insts.d
dtrace: script './user-insts.d' matched 2 probes
^C
[chop]
init 138
dtrace 175
nis_cachemgr 179
automountd 183
intrd 235
run-mozilla.sh 306
thunderbird 316
Xorg 453
thunderbird-bin 2370
sshd 8114
2. A kernel profiled by cycle usage on an AMD platform:
cpc:::BU_cpu_clk_unhalted-kernel-10000
{
        /* arg0 is the kernel PC; func() maps it to its function. */
        @[func(arg0)] = count();
}
# ./kerncycprof.d
dtrace: script './kerncycprof.d' matched 1 probe
^C
[chop]
genunix`vpm_sync_pages 478948
genunix`vpm_unmap_pages 496626
genunix`vpm_map_pages 640785
unix`mutex_delay_default 916703
unix`hat_kpm_page2va 988880
tmpfs`rdtmp 991252
unix`hat_page_setattr 1077717
unix`page_try_reclaim_lock 1213379
genunix`free_vpmap 1914810
genunix`get_vpmap 2417896
unix`page_lookup_create 3992197
unix`mutex_enter 5595647
unix`do_copy_fault_nta 27803554
3. L2 cache misses, by function, generated by any running executables
called 'brendan' on an AMD platform.
cpc:::BU_fill_req_missed_L2-all-0x7-10000
/execname == "brendan"/
{
        /* arg1 is the user-level PC; ufunc() maps it to its function. */
        @[ufunc(arg1)] = count();
}
# ./brendan-l2miss.d
dtrace: script './brendan-l2miss.d' matched 1 probe
CPU ID FUNCTION:NAME
^C
brendan`func_gamma 930
brendan`func_beta 1578
brendan`func_alpha 2945
4. The same example as in example (3) above but using a generic event to
specify L2 data cache misses:
cpc:::PAPI_l2_dcm-all-10000
/execname == "brendan"/
{
        @[ufunc(arg1)] = count();
}
# ./papi-l2miss.d
dtrace: script './papi-l2miss.d' matched 1 probe
^C
brendan`func_gamma 1681
brendan`func_beta 2521
brendan`func_alpha 5068
D. REFERENCES
http://bugs.opensolaris.org/view_bug.do?bug_id=6486156
PSARC/2002/180 CPU Performance Counters (CPC) Version 2
PSARC/2008/334 CPU Performance Counter Generic Event Names
E. DOCUMENTATION
A new chapter will be added to the current Solaris Dynamic Tracing Guide
for this proposed provider:
http://wikis.sun.com/display/DTrace/Documentation # DTrace Guide
F. STABILITY
The DTrace internal stability table is described below:
Element      Name stability    Data stability    Dependency class
Provider     Evolving          Evolving          Common
Module       Private           Private           Unknown
Function     Private           Private           Unknown
Name         Evolving          Evolving          CPU
Arguments    Evolving          Evolving          Common
6. Resources and Schedule
6.4. Steering Committee requested information
6.4.1. Consolidation C-team Name:
OS/Net
6.5. ARC review type: FastTrack
6.6. ARC Exposure: open