Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization
-----Original Message-----
From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
Sent: Thursday, 26 February 2015 17:35
To: Dietmar Hahn; xen-devel@lists.xen.org
Cc: Mayer, Kevin
Subject: Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization

> On 02/26/2015 03:56 AM, Dietmar Hahn wrote:
>> On Wednesday, 25 February 2015, 11:31:31, Boris Ostrovsky wrote:
>>> On 02/25/2015 10:12 AM, kevin.ma...@gdata.de wrote:
>>>> -----Original Message-----
>>>> From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
>>>> Sent: Tuesday, 24 February 2015 18:13
>>>> To: Mayer, Kevin; xen-devel@lists.xen.org
>>>> Subject: Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization
>>>>
>>>>> On 02/24/2015 10:27 AM, kevin.ma...@gdata.de wrote:
>>>>>> Hi guys,
>>>>>>
>>>>>> I'm trying to set up the BTS so that I can log the branches taken in the guest using Xen 4.4.1 with a WinXP SP3 guest on a Core i7 Sandy Bridge. I added the vpmu=bts boot parameter to my grub2 configuration and extended libxl, libxc, domctl, … with a command of my own so that I can trigger the activation of the BTS whenever I want.
>>>>>
>>>>> I am not sure why you are doing all these changes to Xen code. BTS is supposed to be managed from the guest. For example, a Fedora HVM guest will produce this:
>>>>>
>>>>> [root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf record -e branches:u -c 1 -d sleep 1
>>>>> [ perf record: Woken up 3838 times to write data ]
>>>>> [ perf record: Captured and wrote 0.704 MB perf.data (~30756 samples) ]
>>>>> [root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf script -f ip,addr,sym,dso,symoff --show-kernel-path
>>>>> 8167c347 native_irq_return_iret+0x0 (/proc/kcore) = 328c001590 [unknown] (/proc/kcore)
>>>>> 8167c347 native_irq_return_iret+0x0 (/proc/kcore) = 328c001590 [unknown] ([unknown])
>>>>> 328c001593 [unknown] ([unknown]) = 328c004b70 [unknown] ([unknown])
>>>>> ...
>>>>
>>>> I want to be able to log the taken branches (of the guest) without the need to modify the guest at all. This means I have to do all the logic in the hypervisor, or am I wrong?
>>>
>>> In that case, yes. But then you have to make sure that at least
>>>
>>> * you don't load guest's VPMU (or, at least, BTS-related registers) on context switch
>>
>> But you need to modify PMU registers when switching to/from the guest context to get PMU running.
>
> I was thinking that all BTS stuff can be controlled from dom0 and so we can use dom0's version of these registers. I didn't realize that DS_AREA would have to be accessed in guest's address space (and that DEBUGCTL is loaded from VMCS).
>
> Which is what I think I said in response to this message (which didn't show up on the list because Kevin accidentally dropped xen-devel).
>
> -boris

Terribly sorry about that...

So the VPMU doesn't get loaded when there is a VMENTER? I thought I could set the domU-vcpu-vpmu to enable BTS while in dom0 (with modified versions of msr_write_intercept, vpmu_do_wrmsr and core2_vpmu_do_wrmsr of course, since the built-in ones use the current vcpu, which would be the dom0 vcpu) and as soon as there is a context switch to domU the vpmu gets loaded and the guest starts logging.

If the described behavior is correct, the only problem I can see is with allocating memory in dom0 in a way that the guest can access it.

But if I got it wrong, please explain how the vpmu really works.

Cheers
Kevin

>> I didn't think of using the VPMU stuff with modifying the context from outside the guest.
>>
>>> * You don't send the interrupt to the guest (meaning that you will need to somehow inform dom0 of the BTS interrupt)
>>>
>>> and probably more. Essentially, you want dom0 to profile the guest. I have been working on patches that would allow that but they are still under review.
>>>
>>>>>> In this command I do the following: I set up the memory region for the BTS Buffer and the DS Buffer Management Area using xzalloc_bytes
>>>>>
>>>>> I don't think you should be allocating BTS buffers in the hypervisor, they are in guest's memory.
>>>>
>>>> I agree. As I said, I think this is where my main problem is at the moment. Is there any way I can allocate memory in the hypervisor in a way the guest can access it?
>>>
>>> I am not sure this is what you want since you seem to *not* want the guest to process the samples, right? But yes, you can. E.g. something like what map_vcpu_info() does. (I have no idea how you'd do this from Windows.)
>>
>> The DS buffer has to be mapped within the guest's address space so the CPU running in guest context can access this area. Otherwise you get this triple fault. So I would think you need a mixture of writing some stuff in Windows and patching the hypervisor.
>>
>> Dietmar.
>>
>>>> Of course the guest must not be able to use this memory in its normal operations but just for BTS. Is this even possible? I am rather confused at the moment. :-D
>>>>
>>>>>> Then I write the pointer to the BTS Buffer into the DS Buffer
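
The setup Kevin describes (a BTS buffer whose address is written into the DS buffer management area) can be sketched in C as follows. This is a minimal illustration, not code from the thread or from Xen: the field order follows the 64-bit DS save area as described in the Intel SDM, the PEBS fields that follow in the real area are omitted, and `struct bts_record`, `struct ds_area_bts` and `ds_area_bts_init()` are hypothetical names. As the replies point out, `base` has to be an address that is valid in the guest's address space.

```c
#include <assert.h>
#include <stdint.h>

/* One BTS record in 64-bit mode: three 8-byte fields (24 bytes). */
struct bts_record {
    uint64_t from;   /* address of the branch instruction */
    uint64_t to;     /* address of the branch target */
    uint64_t flags;  /* status bits, e.g. whether the branch was predicted */
};

/* BTS half of the 64-bit DS buffer management area.
 * Assumption: PEBS fields are left out because the thread only uses BTS. */
struct ds_area_bts {
    uint64_t bts_buffer_base;          /* first byte of the BTS buffer */
    uint64_t bts_index;                /* where the CPU writes the next record */
    uint64_t bts_absolute_maximum;     /* first byte past the last whole record */
    uint64_t bts_interrupt_threshold;  /* record address that raises the interrupt */
};

_Static_assert(sizeof(struct bts_record) == 24, "BTS record must be 24 bytes");

/* Hypothetical helper: describe a buffer of `bytes` bytes at `base`.
 * `reserve_records` is how many records before the end the threshold sits;
 * it must be at least 1 so the threshold stays below the absolute maximum.
 * Returns 0 on success, -1 if the buffer is too small for that. */
int ds_area_bts_init(struct ds_area_bts *ds, uint64_t base,
                     uint64_t bytes, uint64_t reserve_records)
{
    uint64_t nrec = bytes / sizeof(struct bts_record);

    if (reserve_records == 0 || reserve_records >= nrec)
        return -1;

    ds->bts_buffer_base = base;
    ds->bts_index = base;  /* buffer starts empty */
    ds->bts_absolute_maximum = base + nrec * sizeof(struct bts_record);
    ds->bts_interrupt_threshold =
        ds->bts_absolute_maximum - reserve_records * sizeof(struct bts_record);
    return 0;
}

/* Number of records the CPU has written so far. */
uint64_t ds_area_bts_count(const struct ds_area_bts *ds)
{
    return (ds->bts_index - ds->bts_buffer_base) / sizeof(struct bts_record);
}
```

The address of the filled-in management area is what would then go into DS_AREA; the sketch deliberately stops before that step, since where the memory lives is exactly the open question in the thread.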
Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization
On 02/26/2015 12:57 PM, kevin.ma...@gdata.de wrote:
> -----Original Message-----
> From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
> Sent: Thursday, 26 February 2015 17:35
> To: Dietmar Hahn; xen-devel@lists.xen.org
> Cc: Mayer, Kevin
> Subject: Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization
>
>> On 02/26/2015 03:56 AM, Dietmar Hahn wrote:
>>> On Wednesday, 25 February 2015, 11:31:31, Boris Ostrovsky wrote:
>>>> On 02/25/2015 10:12 AM, kevin.ma...@gdata.de wrote:
>>>>> -----Original Message-----
>>>>> From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
>>>>> Sent: Tuesday, 24 February 2015 18:13
>>>>> To: Mayer, Kevin; xen-devel@lists.xen.org
>>>>> Subject: Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization
>>>>>
>>>>>> On 02/24/2015 10:27 AM, kevin.ma...@gdata.de wrote:
>>>>>>> Hi guys,
>>>>>>>
>>>>>>> I'm trying to set up the BTS so that I can log the branches taken in the guest using Xen 4.4.1 with a WinXP SP3 guest on a Core i7 Sandy Bridge. I added the vpmu=bts boot parameter to my grub2 configuration and extended libxl, libxc, domctl, … with a command of my own so that I can trigger the activation of the BTS whenever I want.
>>>>>>
>>>>>> I am not sure why you are doing all these changes to Xen code. BTS is supposed to be managed from the guest. For example, a Fedora HVM guest will produce this:
>>>>>>
>>>>>> [root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf record -e branches:u -c 1 -d sleep 1
>>>>>> [ perf record: Woken up 3838 times to write data ]
>>>>>> [ perf record: Captured and wrote 0.704 MB perf.data (~30756 samples) ]
>>>>>> [root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf script -f ip,addr,sym,dso,symoff --show-kernel-path
>>>>>> 8167c347 native_irq_return_iret+0x0 (/proc/kcore) = 328c001590 [unknown] (/proc/kcore)
>>>>>> 8167c347 native_irq_return_iret+0x0 (/proc/kcore) = 328c001590 [unknown] ([unknown])
>>>>>> 328c001593 [unknown] ([unknown]) = 328c004b70 [unknown] ([unknown])
>>>>>> ...
>>>>>
>>>>> I want to be able to log the taken branches (of the guest) without the need to modify the guest at all. This means I have to do all the logic in the hypervisor, or am I wrong?
>>>>
>>>> In that case, yes. But then you have to make sure that at least
>>>>
>>>> * you don't load guest's VPMU (or, at least, BTS-related registers) on context switch
>>>
>>> But you need to modify PMU registers when switching to/from the guest context to get PMU running.
>>
>> I was thinking that all BTS stuff can be controlled from dom0 and so we can use dom0's version of these registers. I didn't realize that DS_AREA would have to be accessed in guest's address space (and that DEBUGCTL is loaded from VMCS).
>>
>> Which is what I think I said in response to this message (which didn't show up on the list because Kevin accidentally dropped xen-devel).
>>
>> -boris
>
> Terribly sorry about that...
>
> So the VPMU doesn't get loaded when there is a VMENTER?

Not exactly. For BTS, the DEBUGCTL register, which lives in the VMCS, does get loaded. But not DS_AREA: it gets loaded by software during context_switch()->vpmu_load().

(As for general VPMU registers such as counters: they are also loaded during context_switch(). But I don't think you care about those. From what little I know about BTS, DEBUGCTL and DS_AREA are the only two registers you are interested in.)

> I thought I could set the domU-vcpu-vpmu to enable BTS while in dom0 (with modified versions of msr_write_intercept, vpmu_do_wrmsr and core2_vpmu_do_wrmsr of course, since the built-in ones use the current vcpu, which would be the dom0 vcpu) and as soon as there is a context switch to domU the vpmu gets loaded and the guest starts logging.

And it should work, provided that DS_AREA is set up correctly.

> If the described behavior is correct, the only problem I can see is with allocating memory in dom0 in a way that the guest can access it.

This sounds right. All you have to do now is implementation details ;-)

-boris

> But if I got it wrong, please explain how the vpmu really works.
>
> Cheers
> Kevin
>
>>> I didn't think of using the VPMU stuff with modifying the context from outside the guest.
>>>
>>>> * You don't send the interrupt to the guest (meaning that you will need to somehow inform dom0 of the BTS interrupt)
>>>>
>>>> and probably more. Essentially, you want dom0 to profile the guest. I have been working on patches that would allow that but they are still under review.
>>>>
>>>>>>> In this command I do the following: I set up the memory region for the BTS Buffer and the DS Buffer Management Area using xzalloc_bytes
>>>>>>
>>>>>> I don't think you should be allocating BTS buffers in the hypervisor, they are in guest's memory.
>>>>>
>>>>> I agree. As I said, I think this is where my main problem is at the moment. Is there any way I can allocate memory in the hypervisor in a way the guest can access it?
>>>>
>>>> I am not sure this is what you want since you seem to *not* want the guest to process the samples, right? But yes, you can. E.g. something like what map_vcpu_info() does. (I have no idea how you'd do this from Windows.)
>>>
>>> The DS buffer has to be mapped
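
For reference, the two registers singled out in this reply can be written down as below. This is a hedged sketch rather than Xen code: the MSR numbers and bit positions are the ones given in the Intel SDM, `debugctl_enable_bts()` is a hypothetical helper, and whether the BTS_OFF_OS filter bit is implemented depends on the CPU model. The helper only computes a value; getting it into the guest's DEBUGCTL field in the VMCS, and loading DS_AREA during the context switch, is the part the thread leaves to the implementation.

```c
#include <assert.h>
#include <stdint.h>

/* MSR addresses (Intel SDM). */
#define MSR_IA32_DEBUGCTL     0x1d9u
#define MSR_IA32_DS_AREA      0x600u

/* IA32_DEBUGCTL bits relevant to Branch Trace Store (Intel SDM). */
#define DEBUGCTL_TR           (1ull << 6)   /* generate branch trace messages */
#define DEBUGCTL_BTS          (1ull << 7)   /* store them in the BTS buffer */
#define DEBUGCTL_BTINT        (1ull << 8)   /* interrupt at the DS threshold */
#define DEBUGCTL_BTS_OFF_OS   (1ull << 9)   /* skip branches taken at CPL 0 */
#define DEBUGCTL_BTS_OFF_USR  (1ull << 10)  /* skip branches taken at CPL > 0 */

/* Hypothetical helper: return `old` with BTS tracing switched on.
 * want_interrupt: raise an interrupt when the threshold is reached
 *                 (otherwise the buffer is used as a ring).
 * user_only:      do not record kernel-mode (CPL 0) branches. */
uint64_t debugctl_enable_bts(uint64_t old, int want_interrupt, int user_only)
{
    uint64_t val = old | DEBUGCTL_TR | DEBUGCTL_BTS;

    if (want_interrupt)
        val |= DEBUGCTL_BTINT;
    if (user_only)
        val |= DEBUGCTL_BTS_OFF_OS;
    return val;
}

/* Hypothetical helper: return `old` with all BTS-related bits cleared. */
uint64_t debugctl_disable_bts(uint64_t old)
{
    return old & ~(DEBUGCTL_TR | DEBUGCTL_BTS | DEBUGCTL_BTINT |
                   DEBUGCTL_BTS_OFF_OS | DEBUGCTL_BTS_OFF_USR);
}
```

Both TR and BTS have to be set for records to land in the buffer, which is why the helper always sets them together and leaves every other bit of the old value untouched.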