Stephane Eranian wrote:
Hello Will,
On Mon, Oct 29, 2007 at 01:53:50PM -0400, William Cohen wrote:
Hi,
In one of the perfmon2 presentations there is mention of "Usage models in
virtual environments" among the current challenges:
http://cscads.rice.edu/workshops/july2007/perf-slides-07/Eranian-Perfmon.pdf
Are there more details on the status/plan of perfmon2 in the presence of
virtualized hardware?
As outlined in my OLS2007 presentation, I have done some prototyping work with
KVM and VT-x to gauge the level of difficulty. As expected, there were several
issues. We need to work on resolving those issues at the software and hardware
levels. This work only looked at KVM, but I bet similar issues exist with Xen.
Do you agree about the usefulness of the *two* usage models I outline in the
presentation?
Today, using something like XenOprofile or just plain Oprofile with KVM, you get
the system-wide view but not PMU virtualization. Thus virtualized guests
cannot expose the PMU to guest applications. I think that's bad.
Hi Stephane,
Looking over the OLS2007 slides: are you talking about slide 19, with full
virtualization vs. para-virtualization? I haven't read the VT-x/AMD-V documentation.
Exposing the real CPUID is going to make things difficult when migrating between
machines. On Xen a guest can be moved from one physical machine to another. What
happens if the physical machines are different? Is there some sane way that
perfmon can indicate that this migration has occurred and made data collection
no longer feasible?
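
One way to picture the migration question above is a check that compares the PMU the session was set up on with the PMU it is currently running on. The sketch below is purely illustrative: `pmu_identity`, `session`, and `session_check_migration` are hypothetical names, not part of perfmon2, and the choice of fields to compare is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of the PMU a monitoring session was set up on. */
struct pmu_identity {
    char     vendor[13];     /* CPUID vendor string, NUL-terminated */
    uint32_t signature;      /* CPUID family/model/stepping word */
    uint8_t  num_counters;   /* number of counters exposed */
    uint8_t  counter_width;  /* counter width in bits */
};

enum session_state { SESSION_ACTIVE, SESSION_INVALIDATED };

/* Hypothetical monitoring session: remembers the PMU seen at setup time. */
struct session {
    struct pmu_identity setup_pmu;
    enum session_state  state;
};

static bool pmu_identity_equal(const struct pmu_identity *a,
                               const struct pmu_identity *b)
{
    return strcmp(a->vendor, b->vendor) == 0 &&
           a->signature == b->signature &&
           a->num_counters == b->num_counters &&
           a->counter_width == b->counter_width;
}

/* Called after the guest resumes: if the PMU differs from the one seen at
 * setup, mark the session invalid. The state is sticky, so a later move
 * back to an identical machine does not silently revive stale data. */
static enum session_state session_check_migration(struct session *s,
                                                  const struct pmu_identity *current)
{
    assert(s != NULL && current != NULL);
    if (!pmu_identity_equal(&s->setup_pmu, current))
        s->state = SESSION_INVALIDATED;
    return s->state;
}
```

A tool reading the session could then report "data collection no longer feasible" instead of returning counts that mix two different machines.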
Does the VT-x/AMD-V hardware allow true save and restore of the values? On some
earlier x86 processors there is no way to correctly restore the upper bits of
the performance counters. The stores just sign-extended the 32-bit value written
in. Or are the registers just going to be treated as the lower 32 bits of a
64-bit counter? It would have been nice if the performance counters were
implemented as true 64-bit writeable counters.
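
To make the save/restore problem concrete, here is a minimal sketch in C. It models a counter whose write path takes a 32-bit value and sign-extends bit 31, then shows one way to treat the hardware register as the low part of a 64-bit software counter. The function and type names are hypothetical, the 40-bit width is an assumption for illustration, and this is not the perfmon2 implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a hardware counter `width` bits wide whose write path accepts
 * only 32 bits and sign-extends bit 31 into the upper bits. */
static uint64_t hw_write_sign_extended(uint32_t value, unsigned width)
{
    assert(width > 32 && width < 64);
    uint64_t mask = (UINT64_C(1) << width) - 1;
    int64_t extended = (int32_t)value;      /* sign-extend bit 31 */
    return (uint64_t)extended & mask;
}

/* Hypothetical 64-bit software counter: the hardware holds only the low
 * 31 bits, everything above that lives in `base`. */
struct soft_counter {
    uint64_t base;
};

/* Full 64-bit value: software base plus whatever the hardware holds. */
static uint64_t soft_counter_read(const struct soft_counter *c,
                                  uint64_t hw_value)
{
    return c->base + hw_value;
}

/* Split a saved 64-bit total for restore. Returns the value to write to
 * the hardware: bit 31 is always clear, so sign extension leaves the
 * upper hardware bits at zero and nothing is corrupted. */
static uint32_t soft_counter_restore(struct soft_counter *c, uint64_t total)
{
    uint32_t low = (uint32_t)(total & UINT64_C(0x7fffffff));
    c->base = total - low;
    return low;
}
```

With this split, writing `0x80000000` directly would set every upper bit of the model counter, whereas a value produced by `soft_counter_restore` round-trips exactly through `hw_write_sign_extended` and `soft_counter_read`. The cost is that software has to handle counter overflow more often, since only 31 bits of headroom remain in hardware.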
I would think there are three usage models; the last two are subdivisions within
a guest OS:
system-wide (everything on the system, including the VMM, like the current xenoprof)
guest-wide (just within the one guest OS context)
thread-wide (just within one thread/process context)
Would the VMM be invoked each time the PMU interrupt handler runs, due to NMI
handling and PMU writes? How expensive would that be?
How would system-wide profiling work if guest-wide or thread-wide profiling is
already running? Would the in-kernel PMU allocation be incorporated into the
VMM? Would it steal the existing PMU resource from the host OS?
-Will
_______________________________________________
perfmon mailing list
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/