Re: Some Code for Performance Profiling

2010-04-07 Thread Avi Kivity

On 04/07/2010 10:23 PM, Jiaqing Du wrote:



Can your implementation support both simultaneously?
 

What do you mean by "simultaneously"? With my implementation, you do
either guest-wide profiling or system-wide profiling. They are achieved
through different patches. Actually, the result of guest-wide profiling
is a subset of the system-wide profiling result.


A guest admin monitors the performance of their guest via a vpmu.  
Meanwhile the host admin monitors the performance of the host (including 
all guests) using the host pmu.  Given that the host pmu and the vpmu 
may select different counters, it is difficult to support both 
simultaneously.



For guest-wide profiling, there are two possible places to save and
restore the related MSRs. One is where the CPU switches between guest
mode and host mode. We call this *CPU-switch*. Profiling with this
enabled reflects how the guest behaves on the physical CPU, plus other
virtualized (not emulated) devices. The other place is where the CPU
switches between the KVM context and others. Here, KVM context means
the CPU is executing guest code or KVM code, in both kernel and user
space. We call this *domain-switch*. Profiling with this enabled
discloses how the guest behaves on both the physical CPU and KVM.
(Some emulated operations are really expensive in a virtualized
environment.)

Which method do you use?  Or do you support both?
 

I posted two patches in my previous email. One is for CPU-switch, and
the other is for domain-switch.


I see.  I'm not sure I know which one is better!


Note disclosing host pmu data to the guest is sometimes a security issue.

 

For instance?
   


The standard example is hyperthreading, where the memory bus unit is
shared between the two logical processors.  A guest sampling a vcpu on
one thread can gain information about what is happening on the other,
such as the number of bus transactions the other thread has issued.
This can be used to establish a communication channel between two
guests that shouldn't be communicating, or to eavesdrop on another
guest.  A similar problem exists with multicores.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: Some Code for Performance Profiling

2010-04-07 Thread Jiaqing Du
2010/4/5 Avi Kivity :
> On 03/31/2010 07:53 PM, Jiaqing Du wrote:
>>
>> Hi,
>>
>> We have some code for performance profiling in KVM. It is the output
>> of a school project. Previous discussions on the KVM, Perfmon2, and Xen
>> mailing lists helped us a lot. The code is NOT in good shape and is
>> only meant to demonstrate the feasibility of doing performance
>> profiling in KVM. Feel free to use it if you want.
>>
>
> Performance monitoring is an important feature for kvm.  Is there any chance
> you can work at getting it into good shape?

I have been following the discussions about PMU virtualization on the
list for a while. Exporting a proper interface to the guest, i.e.,
guest-visible MSRs and supported events, across a large number of
physical CPUs from different vendors, families, and models is the major
problem. KVM itself also currently supports almost a dozen different
types of virtual CPUs. I will think about it and try to come up with
something more general.
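
[Aside on the interface problem, not part of the posted patches: on Intel
CPUs the *architectural* PMU can be discovered through CPUID leaf 0x0A,
which is one plausible vendor-neutral subset a hypervisor could expose to
guests. The user-space C sketch below only illustrates that discovery
step and assumes GCC/Clang's <cpuid.h> helper.]

#include <stdio.h>
#include <cpuid.h>                      /* __get_cpuid() (GCC/Clang) */

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* CPUID leaf 0x0A describes architectural performance monitoring. */
	if (!__get_cpuid(0x0a, &eax, &ebx, &ecx, &edx) || (eax & 0xff) == 0) {
		puts("no architectural PMU reported");
		return 1;
	}

	printf("arch PMU version:        %u\n", eax & 0xff);
	printf("GP counters per core:    %u\n", (eax >> 8) & 0xff);
	printf("GP counter bit width:    %u\n", (eax >> 16) & 0xff);
	/* The fixed-function counter count is meaningful for version > 1. */
	printf("fixed-function counters: %u\n", edx & 0x1f);
	return 0;
}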

>
>> We categorize performance profiling in a virtualized environment into
>> two types: *guest-wide profiling* and *system-wide profiling*. For
>> guest-wide profiling, only the guest is profiled. KVM virtualizes the
>> PMU and the user runs a profiler directly in the guest. It requires no
>> modifications to either the guest OS or the profiler running in the
>> guest. For system-wide profiling, both KVM and the guest OS are
>> profiled. The results are similar to what XenOprof outputs. In this
>> case, one profiler runs in the host and one runs in the guest. Again,
>> it requires no modifications to the guest or the profiler running in
>> it.
>>
>
> Can your implementation support both simultaneously?

What do you mean by "simultaneously"? With my implementation, you do
either guest-wide profiling or system-wide profiling. They are achieved
through different patches. Actually, the result of guest-wide profiling
is a subset of the system-wide profiling result.

>
>> For guest-wide profiling, there are two possible places to save and
>> restore the related MSRs. One is where the CPU switches between guest
>> mode and host mode. We call this *CPU-switch*. Profiling with this
>> enabled reflects how the guest behaves on the physical CPU, plus other
>> virtualized (not emulated) devices. The other place is where the CPU
>> switches between the KVM context and others. Here, KVM context means
>> the CPU is executing guest code or KVM code, in both kernel and user
>> space. We call this *domain-switch*. Profiling with this enabled
>> discloses how the guest behaves on both the physical CPU and KVM.
>> (Some emulated operations are really expensive in a virtualized
>> environment.)
>>
>
> Which method do you use?  Or do you support both?

I posted two patches in my previous email. One is for CPU-switch, and
the other is for domain-switch.

>
> Note disclosing host pmu data to the guest is sometimes a security issue.
>

For instance?

> --
> Do not meddle in the internals of kernels, for they are subtle and quick to
> panic.
>
>


Re: Some Code for Performance Profiling

2010-04-05 Thread Avi Kivity

On 03/31/2010 07:53 PM, Jiaqing Du wrote:

Hi,

We have some code for performance profiling in KVM. It is the output
of a school project. Previous discussions on the KVM, Perfmon2, and Xen
mailing lists helped us a lot. The code is NOT in good shape and is
only meant to demonstrate the feasibility of doing performance
profiling in KVM. Feel free to use it if you want.


Performance monitoring is an important feature for kvm.  Is there any 
chance you can work at getting it into good shape?



We categorize performance profiling in a virtualized environment into
two types: *guest-wide profiling* and *system-wide profiling*. For
guest-wide profiling, only the guest is profiled. KVM virtualizes the
PMU and the user runs a profiler directly in the guest. It requires no
modifications to either the guest OS or the profiler running in the
guest. For system-wide profiling, both KVM and the guest OS are
profiled. The results are similar to what XenOprof outputs. In this
case, one profiler runs in the host and one runs in the guest. Again,
it requires no modifications to the guest or the profiler running in it.


Can your implementation support both simultaneously?


For guest-wide profiling, there are two possible places to save and
restore the related MSRs. One is where the CPU switches between guest
mode and host mode. We call this *CPU-switch*. Profiling with this
enabled reflects how the guest behaves on the physical CPU, plus other
virtualized (not emulated) devices. The other place is where the CPU
switches between the KVM context and others. Here, KVM context means
the CPU is executing guest code or KVM code, in both kernel and user
space. We call this *domain-switch*. Profiling with this enabled
discloses how the guest behaves on both the physical CPU and KVM.
(Some emulated operations are really expensive in a virtualized
environment.)


Which method do you use?  Or do you support both?

Note disclosing host pmu data to the guest is sometimes a security issue.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Some Code for Performance Profiling

2010-03-31 Thread Jiaqing Du
Hi,

We have some code for performance profiling in KVM. It is the output
of a school project. Previous discussions on the KVM, Perfmon2, and Xen
mailing lists helped us a lot. The code is NOT in good shape and is
only meant to demonstrate the feasibility of doing performance
profiling in KVM. Feel free to use it if you want.

We categorize performance profiling in a virtualized environment into
two types: *guest-wide profiling* and *system-wide profiling*. For
guest-wide profiling, only the guest is profiled. KVM virtualizes the
PMU and the user runs a profiler directly in the guest. It requires no
modifications to either the guest OS or the profiler running in the
guest. For system-wide profiling, both KVM and the guest OS are
profiled. The results are similar to what XenOprof outputs. In this
case, one profiler runs in the host and one runs in the guest. Again,
it requires no modifications to the guest or the profiler running in it.

For guest-wide profiling, there are two possible places to save and
restore the related MSRs. One is where the CPU switches between guest
mode and host mode. We call this *CPU-switch*. Profiling with this
enabled reflects how the guest behaves on the physical CPU, plus other
virtualized (not emulated) devices. The other place is where the CPU
switches between the KVM context and others. Here, KVM context means
the CPU is executing guest code or KVM code, in both kernel and user
space. We call this *domain-switch*. Profiling with this enabled
discloses how the guest behaves on both the physical CPU and KVM.
(Some emulated operations are really expensive in a virtualized
environment.)

More details can be found at http://jiaqing.org/download/profiling_kvm.tgz
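
[Aside: the patch included below is the domain-switch variant. As a rough
sketch only (not the CPU-switch patch that was posted), the other variant
would swap the same four P6 MSRs right around VM entry and exit, for
example from KVM's vmx_vcpu_run path, so the counters only count while
guest code actually executes. The hook point named here is an assumption
for illustration.]

/*
 * Hypothetical sketch of the CPU-switch hook points (NOT the posted
 * patch).  It reuses vmx_pmu_msr_index[] and NR_VMX_PMU_MSR from the
 * domain-switch patch below and the kernel's rdmsrl()/wrmsrl() helpers.
 */
static u64 host_pmu_msrs[NR_VMX_PMU_MSR];
static u64 guest_pmu_msrs[NR_VMX_PMU_MSR];

static void vpmu_switch_to_guest(void)	/* call just before VM entry */
{
	int i;

	for (i = 0; i < NR_VMX_PMU_MSR; ++i) {
		rdmsrl(vmx_pmu_msr_index[i], host_pmu_msrs[i]);
		wrmsrl(vmx_pmu_msr_index[i], guest_pmu_msrs[i]);
	}
}

static void vpmu_switch_to_host(void)	/* call right after VM exit */
{
	int i;

	for (i = 0; i < NR_VMX_PMU_MSR; ++i) {
		rdmsrl(vmx_pmu_msr_index[i], guest_pmu_msrs[i]);
		wrmsrl(vmx_pmu_msr_index[i], host_pmu_msrs[i]);
	}
}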


==========Guest-wide profiling with domain-switch, for Linux-2.6.32==========

diff --git a/arch/x86/include/asm/thread_info.h
b/arch/x86/include/asm/thread_info.h
index d27d0a2..b749b5d 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -96,6 +96,7 @@ struct thread_info {
 #define TIF_DS_AREA_MSR 26 /* uses thread_struct.ds_area_msr */
 #define TIF_LAZY_MMU_UPDATES   27  /* task is updating the mmu lazily */
 #define TIF_SYSCALL_TRACEPOINT 28  /* syscall tracepoint instrumentation */
+#define TIF_VPMU_CTXSW  29  /* KVM thread tag */

 #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
 #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
@@ -119,6 +120,7 @@ struct thread_info {
 #define _TIF_DS_AREA_MSR   (1 << TIF_DS_AREA_MSR)
 #define _TIF_LAZY_MMU_UPDATES  (1 << TIF_LAZY_MMU_UPDATES)
 #define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)
+#define _TIF_VPMU_CTXSW (1 << TIF_VPMU_CTXSW)

 /* work to do in syscall_trace_enter() */
 #define _TIF_WORK_SYSCALL_ENTRY\
@@ -146,8 +148,9 @@ struct thread_info {

 /* flags to check in __switch_to() */
 #define _TIF_WORK_CTXSW \
-   (_TIF_IO_BITMAP|_TIF_DEBUGCTLMSR|_TIF_DS_AREA_MSR|_TIF_NOTSC)
-
+   (_TIF_IO_BITMAP|_TIF_DEBUGCTLMSR|_TIF_DS_AREA_MSR|_TIF_NOTSC|   \
+ _TIF_VPMU_CTXSW)
+
 #define _TIF_WORK_CTXSW_PREV _TIF_WORK_CTXSW
 #define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW|_TIF_DEBUG)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 5284cd2..d5269d8 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -178,6 +178,53 @@ int set_tsc_mode(unsigned int val)
return 0;
 }

+static const u32 vmx_pmu_msr_index[] = {
+	MSR_P6_EVNTSEL0, MSR_P6_EVNTSEL1, MSR_P6_PERFCTR0, MSR_P6_PERFCTR1,
+};
+#define NR_VMX_PMU_MSR ARRAY_SIZE(vmx_pmu_msr_index)
+static u64 vpmu_msr_list[NR_VMX_PMU_MSR];
+
+static void vpmu_load_msrs(u64 *msr_list)
+{
+	u64 *p = msr_list;
+	int i;
+
+	for (i = 0; i < NR_VMX_PMU_MSR; ++i) {
+		wrmsrl(vmx_pmu_msr_index[i], *p);
+		p++;
+	}
+}
+
+static void vpmu_save_msrs(u64 *msr_list)
+{
+	u64 *p = msr_list;
+	int i;
+
+	for (i = 0; i < NR_VMX_PMU_MSR; ++i) {
+		rdmsrl(vmx_pmu_msr_index[i], *p);
+		p++;
+	}
+}
+
+#define P6_EVENTSEL0_ENABLE (1 << 22)
+static void enable_perf(void)
+{
+	u64 val;
+
+	rdmsrl(MSR_P6_EVNTSEL0, val);
+	val |= P6_EVENTSEL0_ENABLE;
+	wrmsrl(MSR_P6_EVNTSEL0, val);
+}
+
+static void disable_perf(void)
+{
+	u64 val;
+
+	rdmsrl(MSR_P6_EVNTSEL0, val);
+	val &= ~P6_EVENTSEL0_ENABLE;
+	wrmsrl(MSR_P6_EVNTSEL0, val);
+}
+
 void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
  struct tss_struct *tss)
 {
@@ -186,6 +233,21 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
 	prev = &prev_p->thread;
 	next = &next_p->thread;

+	if (test_tsk_thread_flag(prev_p, TIF_VPMU_CTXSW) &&
+	    test_tsk_thread_flag(next_p, TIF_VPMU_CTXSW)) {
+		/* do nothing, still in KVM context */
+	} else {
+		if