Stephane Eranian wrote:
Will,
On Tue, Sep 19, 2006 at 04:56:01PM -0400, William Cohen wrote:
At the very least there needs to be a mechanism to read the values of the
performance monitoring hardware registers in kernel-space. Certainly people
have used get_cycles() to see how long certain things take to do within the
kernel. Having access to the performance monitoring counters would allow
better testing of some hypotheses, e.g. were there fewer or more cache
misses with this approach versus another approach. It isn't practical to do
the read of the performance counter in user-space. Too bad that the
performance hardware designers for most processors took short cuts, so that
a simple direct reading of the perfmon hardware data counters won't work.
You can read any raw performance counters in kernel space using the appropriate
assembly instruction. On x86 that would be rdmsr/rdpmc. Of course, that would
not give you the full 64-bit (software virtualized) value. But I suspect that
in-kernel you are after micro-measurements that are unlikely to run long
enough to overflow a 32-bit counter (especially if not measuring cycles).
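
A rough sketch of the raw read described above, not perfmon2 code. The x86
wrapper issues rdpmc and keeps only the low 32 bits, and the helper computes
the number of events between two snapshots. The names raw_rdpmc_low32 and
counter_delta32 are made up for illustration, and the 32-bit view is an
assumption taken from the overflow remark above.

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
/* Read one raw counter with RDPMC. 'index' selects the counter; which
 * event it counts depends on how the PMU was programmed beforehand.
 * The instruction faults outside ring 0 unless CR4.PCE is set, so this
 * is meant for kernel context. */
static inline uint32_t raw_rdpmc_low32(uint32_t index)
{
    uint32_t lo, hi;

    __asm__ volatile("rdpmc" : "=a"(lo), "=d"(hi) : "c"(index));
    (void)hi; /* upper bits dropped: this sketch keeps a 32-bit view */
    return lo;
}
#endif

/* Events between two 32-bit snapshots. Unsigned subtraction is modulo
 * 2^32, so the result is still correct if the counter wrapped once. */
static inline uint32_t counter_delta32(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}
```

If the interval is long enough for the counter to wrap more than once, the
delta is wrong, which is the case the software-virtualized 64-bit value is
there to cover.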
I think you are after a small subset of the calls from perfmon2, namely
start/stop, read counters. I think the setup/tear-down could be done at the
user level, i.e., you'd have to assume there is a session going. If we
further assume system-wide ONLY and that you can only operate on the cpu where
you issue the call, then it would not be too difficult to add the 3 calls you
need.
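
To make the shape of those 3 calls concrete, here is a minimal sketch under
the same assumptions (system-wide session already set up from user level,
current cpu only). Every name in it (kpfm_start, kpfm_stop, kpfm_read,
KPFM_MAX_COUNTERS) is hypothetical and not part of perfmon2, and the
hardware counters are replaced by plain variables so the logic can be
exercised outside the kernel.

```c
#include <stdint.h>

#define KPFM_MAX_COUNTERS 4 /* hypothetical limit for this sketch */

/* Stand-ins for the counters of the current cpu. In the kernel these
 * would be read with rdmsr/rdpmc instead. */
static uint64_t fake_hw_counter[KPFM_MAX_COUNTERS];
static int fake_monitoring_active;

/* Test helper: pretend 'n' events hit counter 'idx'. Events are only
 * counted while monitoring is active. */
static void fake_hw_tick(unsigned int idx, uint64_t n)
{
    if (fake_monitoring_active && idx < KPFM_MAX_COUNTERS)
        fake_hw_counter[idx] += n;
}

/* Call 1: start counting on the current cpu. */
static int kpfm_start(void)
{
    fake_monitoring_active = 1;
    return 0;
}

/* Call 2: stop counting on the current cpu. */
static int kpfm_stop(void)
{
    fake_monitoring_active = 0;
    return 0;
}

/* Call 3: read one counter. Returns 0 on success, -1 on a bad index
 * or a NULL output pointer. */
static int kpfm_read(unsigned int idx, uint64_t *value)
{
    if (idx >= KPFM_MAX_COUNTERS || !value)
        return -1;
    *value = fake_hw_counter[idx];
    return 0;
}
```

An interval measurement would then be start, read, run the code of
interest, read again, stop, and subtract the two values.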
For sampling in the kernel the existing user-space infrastructure can be used.
The main place that a perfmon kernel API would be desired is for the interval
measurements, e.g. how many events occurred between this place and another place
in the code. Maybe allow a custom buffering mechanism to be triggered by a
software event, and allow it to place information in the header so the location
in the code is known. The program counter might be used as the identifier. Can the
custom buffer mechanism be called directly? What locks are held when elements
are written into the custom buffer? Having locks in the custom buffer might
cause problems.
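
One way to picture the record layout and the locking question is the sketch
below. It is not the perfmon2 custom buffer format. It assumes one buffer per
cpu with a single writer that cannot be preempted while writing, which is
what lets it avoid a lock. The names interval_rec, interval_buf and
interval_buf_put are hypothetical.

```c
#include <stdint.h>

#define IBUF_SLOTS 8u /* power of two so the index can be masked */

/* One interval measurement, identified by the program counter of the
 * place in the code that triggered it. */
struct interval_rec {
    uintptr_t pc;     /* location identifier */
    uint64_t  events; /* events counted over the interval */
};

/* Per-cpu ring buffer. 'head' counts all records ever written, so the
 * oldest entries are overwritten once the ring is full. */
struct interval_buf {
    struct interval_rec slot[IBUF_SLOTS];
    unsigned int head;
};

/* Append one record. No lock is taken: the single-writer, per-cpu
 * assumption stated above is what makes that safe in this sketch. */
static void interval_buf_put(struct interval_buf *b, uintptr_t pc,
                             uint64_t events)
{
    struct interval_rec *r = &b->slot[b->head & (IBUF_SLOTS - 1u)];

    r->pc = pc;
    r->events = events;
    b->head++;
}
```

If the software event can fire from interrupt context on the same cpu, the
single-writer assumption no longer holds and the write would need interrupts
disabled or an atomic slot reservation.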
This would change the mechanism currently implemented in systemtap for using
the perfmon hw. However, it would simplify the issue of loading the perfmon
kernel module, because that would just happen as a side-effect of the
user-space setup.
-Will
_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/