Stephane Eranian wrote:
Will,

On Tue, Sep 19, 2006 at 04:56:01PM -0400, William Cohen wrote:

At the very least there needs to be a mechanism to read the values of the performance monitoring hardware registers in kernel-space. Certainly people have used get_cycles() to see how long certain things take to do within the kernel. Having access to the performance monitoring counters would allow better testing of some hypotheses, e.g. were there fewer or more cache misses with this approach versus another approach. It isn't practical to do the read of the performance counter in user-space. Too bad that the performance hardware designers for most processors took shortcuts, so that a simple direct reading of the perfmon hardware data counters won't work.


You can read any raw performance counters in kernel space using the appropriate assembly instruction. On x86 that would be rdmsr/rdpmc. Of course, that would not give you the full 64-bit (software virtualized) value. But I suspect that in-kernel you are after micro-measurements that are unlikely to run long
enough to overflow a 32-bit counter (especially if not measuring cycles).
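
As an illustration of the short-interval case, here is a minimal sketch of the delta arithmetic, assuming only the low 32 bits of a counter are kept. The helper name pmc_delta32 is hypothetical and is not part of perfmon2. In the kernel the two readings would come from rdpmc or rdmsr; here they are passed in as plain values so the arithmetic can be checked anywhere.

```c
#include <stdint.h>

/* Hypothetical helper: events counted between two raw 32-bit reads.
 * Unsigned arithmetic wraps modulo 2^32, so the result is still
 * correct if the counter wrapped at most once between the reads.
 * Two or more wraps cannot be detected from the raw values alone. */
static inline uint32_t pmc_delta32(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}
```

For example, a start value of 0xFFFFFFF0 and an end value of 0x10 yield a delta of 0x20, so a single wrap does not need special handling.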

I think you are after a small subset of the calls from perfmon2, namely
start/stop, read counters. I think the setup/tear-down could be done at the
user level, i.e., you'd have to assume there is a session going. If we
further assume system-wide ONLY and that you can only operate on the cpu where
you issue the call, then it would not be too difficult to add the 3 calls you 
need.

For sampling in the kernel the existing user-space infrastructure can be used. The main place that a perfmon kernel API would be desired is for interval measurements, e.g. how many events occurred between this place and another place in the code. Maybe allow the custom buffering mechanism to be triggered by a software event, and allow it to place information in the header so the location in the code is known. The program counter might be used as the identifier. Can the custom buffer mechanism be called directly? What locks are held when elements are written into the custom buffer? Having locks in the custom buffer might cause problems.
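
To make the interval idea concrete, here is a rough sketch of what one record and a lock-free append might look like. Every name in it (iv_record, iv_buffer, iv_record_interval, IV_BUF_SLOTS) is hypothetical; this is not the perfmon2 custom buffer interface. It assumes one buffer per CPU with a single writer, which in a real kernel would also require that the writer cannot be preempted or interrupted by another writer to the same buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define IV_BUF_SLOTS 64  /* arbitrary size chosen for the sketch */

/* One interval sample: where it was taken and how many events elapsed. */
struct iv_record {
    uintptr_t site;   /* identifier, e.g. the program counter of the probe */
    uint32_t  events; /* counter delta over the interval */
};

/* Single-writer buffer, assumed to be one per CPU so no lock is taken. */
struct iv_buffer {
    size_t           next;               /* index of the next free slot */
    struct iv_record slot[IV_BUF_SLOTS];
};

/* Append one record built from two raw 32-bit counter reads.
 * Returns 0 on success, -1 if the buffer is full (the sample is dropped). */
static int iv_record_interval(struct iv_buffer *buf, uintptr_t site,
                              uint32_t start, uint32_t end)
{
    if (buf->next >= IV_BUF_SLOTS)
        return -1;
    buf->slot[buf->next].site = site;
    buf->slot[buf->next].events = (uint32_t)(end - start);
    buf->next++;
    return 0;
}
```

Dropping samples when the buffer is full keeps the write path free of locks and waiting, at the cost of losing data; whether that trade-off is acceptable depends on the measurement.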

This would change the currently implemented mechanism in systemtap for using perfmon hw. However, it would simplify the issue of loading the perfmon kernel module, because that would just happen as a side effect of the user-space setup.

-Will
_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
