Is there any interest in having support for more accurate and efficient 
counter virtualization added to perfmon2?

By more accurate, we mean providing an option to exclude time spent in
interrupts from per-thread time.
By more efficient, we mean providing a way for user-space tools to read a
mapped data area where perfmon would write, for each thread, the values of
the performance monitoring counters at the last significant event
(interrupt exit or dispatch).
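
To make the accounting concrete, here is a minimal sketch of how counts
could be charged to a thread while excluding interrupt time. It is only an
illustration, not perfmon2 or Performance Inspector code: the struct vctr
type and the vctr_resume/vctr_pause helpers are hypothetical names, and the
raw counter value is assumed to be already extended to 64 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread virtual counter; not a perfmon2 structure. */
struct vctr {
    uint64_t accumulated; /* counts charged to the thread so far */
    uint64_t last_raw;    /* raw counter value at the last significant event */
    bool     running;     /* true while counts are charged to the thread */
};

/* Call when the thread is dispatched in, or on interrupt exit. */
static void vctr_resume(struct vctr *v, uint64_t raw)
{
    assert(v != NULL);
    if (v->running)
        return;           /* already charging; nothing to do */
    v->last_raw = raw;
    v->running = true;
}

/* Call on interrupt entry, or when the thread is dispatched out. */
static void vctr_pause(struct vctr *v, uint64_t raw)
{
    assert(v != NULL);
    if (!v->running)
        return;           /* e.g. dispatched out from inside an interrupt */
    v->accumulated += raw - v->last_raw; /* unsigned math tolerates wrap */
    v->last_raw = raw;
    v->running = false;
}
```

With this scheme, whatever the hardware counts between an interrupt entry
and the matching interrupt exit is never added to the thread's total.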

This is the approach we use in our Performance Inspector toolset
(http://sourceforge.net/projects/perfinsp/):
the Performance Inspector kernel driver virtualizes counters per thread by
dynamically patching the dispatcher and the interrupt entries/exits.
The Java profiler, jprof, gets the virtualized counter values on every 
method entry and method exit using JVMPI or JVMTI support,
so it can produce per-method reports. 
It can also collect these values for C code that has been recompiled to
issue function entry/exit notifications.
The current algorithm for per-thread metrics keeps the 64-bit values
accumulated by the device driver in a mapped per-thread area, so that
the performance counters can be read efficiently in application mode,
without a transition to kernel mode through a system call.
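
As a rough illustration of the user-mode read path, the sketch below reads
one 64-bit value from a mapped per-thread area without a system call. The
layout is an assumption, not our driver's actual format: struct thread_area,
NUM_COUNTERS, area_publish and area_read are hypothetical names, and the
sequence counter is one possible way to keep a reader from seeing a
half-updated value. area_publish merely stands in for the driver side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NUM_COUNTERS 4 /* hypothetical number of counters per thread */

/* Hypothetical layout of the mapped per-thread area. */
struct thread_area {
    atomic_uint      seq;                 /* odd while an update is in progress */
    _Atomic uint64_t value[NUM_COUNTERS]; /* accumulated per-thread counts */
};

/* Writer side (stands in for the driver): publish a new set of values. */
static void area_publish(struct thread_area *a,
                         const uint64_t vals[NUM_COUNTERS])
{
    unsigned s = atomic_load_explicit(&a->seq, memory_order_relaxed);
    atomic_store_explicit(&a->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    for (int i = 0; i < NUM_COUNTERS; i++)
        atomic_store_explicit(&a->value[i], vals[i], memory_order_relaxed);
    atomic_store_explicit(&a->seq, s + 2, memory_order_release);
}

/* Reader side (application mode): retry until a stable value is seen. */
static uint64_t area_read(struct thread_area *a, int idx)
{
    unsigned s0, s1;
    uint64_t v;

    assert(idx >= 0 && idx < NUM_COUNTERS);
    do {
        s0 = atomic_load_explicit(&a->seq, memory_order_acquire);
        v  = atomic_load_explicit(&a->value[idx], memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s1 = atomic_load_explicit(&a->seq, memory_order_relaxed);
    } while ((s0 & 1u) || s0 != s1);
    return v;
}
```

A profiler in the style of jprof could call area_read at method entry and
again at method exit, and attribute the difference to that method.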

Since there is a fairly high probability of perfmon2 being accepted into 
the mainline kernel,
we would like to use the interfaces it provides.
However, we believe a couple of features would need to be added to
perfmon2 to provide the same functionality as our tools.
We would like to contribute support for these features if there is
interest in them in the community.

Milena Milenkovic
WebSphere BPM & C Performance Tools
[EMAIL PROTECTED]
_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
