Re: [perfmon] New features proposal

Milena Milenkovic Thu, 18 Oct 2007 14:46:36 -0700

Hi Stephane,

Thanks for your prompt answer. We will allocate some time for these
features in our plans for next year.

Looking forward to contributing to perfmon2,

Milena

Stephane Eranian <[EMAIL PROTECTED]> 
Sent by: [EMAIL PROTECTED]
10/18/2007 11:31 AM
Please respond to
[EMAIL PROTECTED]

To
Milena Milenkovic/Austin/[EMAIL PROTECTED]
cc
[EMAIL PROTECTED]
Subject
Re: [perfmon] New features proposal

Hello Milena,

On Wed, Oct 17, 2007 at 09:34:05AM -0500, Milena Milenkovic wrote:
> Is there any interest in having support for more accurate and efficient 
> counter virtualization added to perfmon2?
> 
> By more accurate, we mean providing an option to exclude time spent in 
> interrupts from per-thread time.

I assume you mean turning on/off monitoring around interrupt handlers.

Several months ago, I looked into how to turn monitoring on/off around the
idle loop (i.e., the actual mwait()). It turned out to be quite expensive,
especially on x86 where clearing MSRs is a very slow operation (several
hundred cycles). Just like for interrupt handlers, the idea was to
exclude useless execution from being monitored, because some counters
actually count during mwait().

I am not against the idea. In fact on Itanium, the hardware can do this
automatically, so there is no penalty. On this architecture, perfmon
supports this for system-wide contexts only. You simply pass a flag when
you create the perfmon session. I think this can be implemented in the
same way on other platforms.

There is simply a question of cost compared to the execution time of the
interrupt handler. I think it would be worth investigating. If it turns
out to be both useful and efficient, then I would have no problem adding
it, although I still think hardware support is much better.

> By more efficient, we mean providing a way for user-space tools to read a
> mapped data area where perfmon would write the values of performance
> monitoring counters at the last significant event (interrupt
> exit/dispatch) for each thread.
> 
> This is the approach we use for our Performance Inspector toolset (
> http://sourceforge.net/projects/perfinsp/):
> the Performance Inspector kernel driver virtualizes counters by thread by
> dynamically patching the dispatcher and interrupt entries/exits.

Perfmon does provide per-thread monitoring ("counter virtualization") by
saving/restoring counters on context switches and via hooks on fork and
exit.
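
To make the save/restore scheme concrete, here is a minimal Python
simulation. It is only a sketch of the idea: Pmu, Scheduler, and their
methods are hypothetical names for illustration and are not perfmon2
interfaces.

```python
class Pmu:
    """Simulated PMU with a single counter register (hypothetical)."""

    def __init__(self):
        self.counter = 0


class Scheduler:
    """Virtualizes the single PMU counter per thread via save/restore."""

    def __init__(self, pmu):
        self.pmu = pmu
        self.saved = {}      # thread id -> counter value saved at switch-out
        self.current = None  # thread currently on the CPU

    def context_switch(self, next_tid):
        # Save the outgoing thread's counter value.
        if self.current is not None:
            self.saved[self.current] = self.pmu.counter
        # Restore the incoming thread's value (0 for a new thread).
        self.pmu.counter = self.saved.get(next_tid, 0)
        self.current = next_tid

    def run(self, events):
        # Simulate the current thread generating `events` counted events.
        self.pmu.counter += events

    def read(self, tid):
        # A running thread's value lives in the register, others in memory.
        if tid == self.current:
            return self.pmu.counter
        return self.saved.get(tid, 0)
```

With this model, events counted while thread B runs never show up in
thread A's value, because A's count is parked in memory while B owns the
register.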

> The Java profiler, jprof, gets the virtualized counter values on every 
> method entry and method exit using JVMPI or JVMTI support,
> so it can produce per-method reports. 
> It can also collect these values for C-code that has been recompiled to 
> issue function entry/exit notifications. 
> The current algorithm for per thread metrics keeps the 64-bit values 
> accumulated by the device driver code in a mapped thread area 
> that allows for the reads of the performance counters to be done 
> efficiently in application mode 
> as opposed to requiring a transition to kernel mode using system calls. 
> 

That makes sense. It seems that you may not need to read the data right
when you exit the function. You may be able to read from the buffer at a
later time (as long as you can correlate with the function name, using
the instruction pointer).
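
As a rough illustration of that deferred correlation, the Python sketch
below maps recorded instruction pointers back to function names with a
sorted symbol table. The symbol table, addresses, and the (ip, delta)
sample layout are all assumptions made up for the example.

```python
import bisect

# Hypothetical symbol table: (start address, function name), sorted by
# address. A real tool would load this from the binary's symbols.
SYMBOLS = [(0x1000, "main"), (0x1400, "parse"), (0x1900, "compute")]
STARTS = [addr for addr, _ in SYMBOLS]


def function_for(ip):
    """Return the function whose start address is the closest at or below ip.

    No end addresses are tracked in this sketch, so any ip past the last
    start address is attributed to the last function.
    """
    i = bisect.bisect_right(STARTS, ip) - 1
    return SYMBOLS[i][1] if i >= 0 else None


def attribute(samples):
    """Sum counter deltas per function from (ip, delta) samples read later."""
    totals = {}
    for ip, delta in samples:
        name = function_for(ip)
        totals[name] = totals.get(name, 0) + delta
    return totals
```

For example, attribute([(0x1004, 10), (0x1910, 7)]) would credit 10 to
main and 7 to compute without the tool having read anything at function
exit time.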

> Since there is a fairly high probability of perfmon2 being accepted into
> the mainline kernel, we would like to use the interfaces it provides.
> However, we believe a couple of features may be added to perfmon2
> to provide the same functionality of our tools.
> We would like to provide the support for these features if there is
> interest for them in the community.
> 
Note that perfmon does support an in-kernel sampling buffer. In your case,
I believe what you would need is a way to trigger recording of a sample
at specific locations, as opposed to when a counter overflows (or a
timeout expires).

Currently perfmon records samples in the buffer only when a PMU register
generates an interrupt. This happens when a counter overflows, for
instance. Supposing you had a way to trigger recording of a sample
on function entry and exit, then you would get what you want. I think
the trigger could be implemented as a trap. For instance, on x86 we could
possibly use a software interrupt (int 0x..), then catch this and force
perfmon to think there was a PMU interrupt. I am sure there are equivalent
mechanisms on other architectures.
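
The following Python sketch models that idea: one buffer that receives
samples both on counter overflow and on an explicit trigger standing in
for the trap. It is a simulation under assumed names (Sample,
SamplingBuffer, count, trigger), not a description of the existing
perfmon sampling buffer format.

```python
from collections import namedtuple

# Hypothetical sample record: why it was taken, where, and the counter value.
Sample = namedtuple("Sample", "reason ip counter")


class SamplingBuffer:
    """Records samples on counter overflow and on explicit triggers."""

    def __init__(self, period, capacity=1024):
        self.period = period          # events between overflow samples
        self.capacity = capacity      # maximum number of stored samples
        self.counter = 0              # simulated event counter
        self.next_overflow = period   # counter value of the next overflow
        self.samples = []

    def _record(self, reason, ip):
        # Drop the sample silently if the buffer is full.
        if len(self.samples) < self.capacity:
            self.samples.append(Sample(reason, ip, self.counter))

    def count(self, events, ip):
        # Simulate `events` occurring at `ip`; sample on each overflow.
        self.counter += events
        while self.counter >= self.next_overflow:
            self._record("overflow", ip)
            self.next_overflow += self.period

    def trigger(self, ip):
        # Stands in for the trap handler that pretends a PMU interrupt
        # occurred, e.g. at function entry or exit.
        self._record("trigger", ip)
```

A profiler would call trigger() at function entry and again at exit, and
the difference between the two recorded counter values gives the count
for that call.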

I think this is an interesting idea worth pursuing.

-- 
-Stephane
_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
