Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization

2015-02-26 Thread Kevin.Mayer


 -Original Message-
 From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
 Sent: Thursday, 26 February 2015 17:35
 To: Dietmar Hahn; xen-devel@lists.xen.org
 Cc: Mayer, Kevin
 Subject: Re: [Xen-devel] Branch Trace Storage for guests and
 VPMU initialization
 
 On 02/26/2015 03:56 AM, Dietmar Hahn wrote:
  On Wednesday, 25 February 2015, 11:31:31, Boris Ostrovsky wrote:
  On 02/25/2015 10:12 AM, kevin.ma...@gdata.de wrote:
  -Original Message-
  From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
  Sent: Tuesday, 24 February 2015 18:13
  To: Mayer, Kevin; xen-devel@lists.xen.org
  Subject: Re: [Xen-devel] Branch Trace Storage for guests and VPMU
  initialization
 
  On 02/24/2015 10:27 AM, kevin.ma...@gdata.de wrote:
  Hi guys
 
  I'm trying to set up BTS so that I can log the branches taken
  in the guest, using Xen 4.4.1 with a WinXP SP3 guest on a Core i7
  Sandy Bridge.
 
  I added the vpmu=bts boot parameter to my grub2 configuration and
  extended libxl, libxc, domctl, … with a command of my own so that I
  can trigger the activation of BTS whenever I want.
 
  I am not sure why you are doing all these changes to Xen code. BTS
  is supposed to be managed from the guest. For example, a Fedora HVM
  guest will produce this:
 
  [root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf record -e branches:u -c 1 -d sleep 1
  [ perf record: Woken up 3838 times to write data ]
  [ perf record: Captured and wrote 0.704 MB perf.data (~30756 samples) ]
  [root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf script -f ip,addr,sym,dso,symoff --show-kernel-path
  8167c347 native_irq_return_iret+0x0 (/proc/kcore) => 328c001590 [unknown] (/proc/kcore)
  8167c347 native_irq_return_iret+0x0 (/proc/kcore) => 328c001590 [unknown] ([unknown])
  328c001593 [unknown] ([unknown]) => 328c004b70 [unknown] ([unknown])
  ...
 
  I want to be able to log the branches taken by the guest without
  needing to modify the guest at all.
  This means I have to do all the logic in the hypervisor, or am I wrong?
  In that case, yes. But then you have to make sure that at least
 * you don't load guest's VPMU (or, at least, BTS-related
  registers) on context switch
  But you need to modify PMU registers when switching to/from the guest
  context to get PMU running.
 
 
 
 I was thinking that all BTS stuff can be controlled from dom0 and so we can
 use dom0's version of these registers. I didn't realize that DS_AREA would
 have to be accessed in guest's address space (and that DEBUGCTL is loaded
 from VMCS).
 
 Which is what I think I said in response to this message (which didn't show up
 on the list because Kevin accidentally dropped xen-devel).
 
 -boris
 
Terribly sorry about that...

So the VPMU doesn't get loaded when there is a VMENTER?
I thought I could configure the domU vCPU's VPMU to enable BTS while running
in dom0 (with modified versions of msr_write_intercept, vpmu_do_wrmsr and
core2_vpmu_do_wrmsr, of course, since the built-in ones operate on the current
vCPU, which would be the dom0 vCPU), and that as soon as there is a context
switch to domU the VPMU gets loaded and the guest starts logging.
If that behavior is correct, the only problem I can see is allocating memory
in dom0 in a way that the guest can access it.
But if I got it wrong, please explain how the VPMU really works.

Cheers

Kevin


 
 
  I hadn't thought of using the VPMU by modifying its context from
  outside the guest.
 
 * You don't send the interrupt to the guest (meaning that you will
  need to somehow inform dom0 of the BTS interrupt)
 
  and probably more.
 
  Essentially, you want dom0 to profile the guest. I have been working
  on patches that would allow that but they are still under review.
 
 
  In this command I do the following:
 
  I set up the memory region for the BTS Buffer and the DS Buffer
  Management Area using xzalloc_bytes
 
  I don't think you should be allocating BTS buffers in the
  hypervisor, they are in guest's memory.
  I agree. As I said I think this is where my main problem is at the moment.
  Is there any way I can allocate memory in the hypervisor in a way the
  guest can access it?
  I am not sure this is what you want since you seem to *not* want the
  guest to process the samples, right?
 
  But yes, you can. E.g. something like what map_vcpu_info() does. (I
  have no idea how you'd do this from Windows.)
  The DS buffer has to be mapped within the guest's address space so the
  CPU running in guest context can access this area. Otherwise you get
  the triple fault.
  So I would think you need a mixture of writing some stuff in Windows
  and patching the hypervisor.
 
  Dietmar.
 
 
  Of course the guest must not be able to use this memory in its normal
  operations but just for BTS.
  Is this even possible? I am rather confused at the moment. :-D
 
  Then I write the pointer to the BTS Buffer into the DS Buffer
  

Re: [Xen-devel] Branch Trace Storage for guests and VPMU initialization

2015-02-26 Thread Boris Ostrovsky

On 02/26/2015 12:57 PM, kevin.ma...@gdata.de wrote:




-Original Message-
From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
Sent: Thursday, 26 February 2015 17:35
To: Dietmar Hahn; xen-devel@lists.xen.org
Cc: Mayer, Kevin
Subject: Re: [Xen-devel] Branch Trace Storage for guests and
VPMU initialization

On 02/26/2015 03:56 AM, Dietmar Hahn wrote:

On Wednesday, 25 February 2015, 11:31:31, Boris Ostrovsky wrote:

On 02/25/2015 10:12 AM, kevin.ma...@gdata.de wrote:

-Original Message-
From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
Sent: Tuesday, 24 February 2015 18:13
To: Mayer, Kevin; xen-devel@lists.xen.org
Subject: Re: [Xen-devel] Branch Trace Storage for guests and VPMU
initialization

On 02/24/2015 10:27 AM, kevin.ma...@gdata.de wrote:

Hi guys

I'm trying to set up BTS so that I can log the branches taken
in the guest, using Xen 4.4.1 with a WinXP SP3 guest on a Core i7
Sandy Bridge.

I added the vpmu=bts boot parameter to my grub2 configuration and
extended libxl, libxc, domctl, … with a command of my own so that I
can trigger the activation of BTS whenever I want.


I am not sure why you are doing all these changes to Xen code. BTS
is supposed to be managed from the guest. For example, a Fedora HVM
guest will produce this:

[root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf record -e branches:u -c 1 -d sleep 1
[ perf record: Woken up 3838 times to write data ]
[ perf record: Captured and wrote 0.704 MB perf.data (~30756 samples) ]
[root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# perf script -f ip,addr,sym,dso,symoff --show-kernel-path
8167c347 native_irq_return_iret+0x0 (/proc/kcore) => 328c001590 [unknown] (/proc/kcore)
8167c347 native_irq_return_iret+0x0 (/proc/kcore) => 328c001590 [unknown] ([unknown])
328c001593 [unknown] ([unknown]) => 328c004b70 [unknown] ([unknown])
...


I want to be able to log the branches taken by the guest without
needing to modify the guest at all.

This means I have to do all the logic in the hypervisor, or am I wrong?

In that case, yes. But then you have to make sure that at least
* you don't load guest's VPMU (or, at least, BTS-related
registers) on context switch

But you need to modify PMU registers when switching to/from the guest
context to get PMU running.




I was thinking that all BTS stuff can be controlled from dom0 and so we can
use dom0's version of these registers. I didn't realize that DS_AREA would
have to be accessed in guest's address space (and that DEBUGCTL is loaded
from VMCS).

Which is what I think I said in response to this message (which didn't show up
on the list because Kevin accidentally dropped xen-devel).

-boris


Terribly sorry about that...

So the VPMU doesn’t get loaded when there is a VMENTER?



Not exactly. For BTS, the DEBUGCTL register, which lives in the VMCS, does
get loaded. But DS_AREA does not: it is loaded by software during
context_switch() -> vpmu_load().


(General VPMU registers such as counters are also loaded during
context_switch(), but I don't think you care about those. From
what little I know about BTS, DEBUGCTL and DS_AREA are the only two
registers you are interested in.)
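
For reference, a minimal sketch of the two registers and the DEBUGCTL bits that matter for BTS. The MSR indices and bit positions are as I read them in the Intel SDM; the macro names and the helper bts_debugctl_value() are made up for this illustration and are not Xen identifiers, so check them against the SDM and the Xen headers before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* MSR indices per the Intel SDM (names here are illustrative, not Xen's). */
#define EX_MSR_IA32_DEBUGCTL 0x000001d9u
#define EX_MSR_IA32_DS_AREA  0x00000600u

/* IA32_DEBUGCTL bits relevant to BTS. */
#define EX_DEBUGCTL_TR          (1ull << 6)  /* enable branch trace messages   */
#define EX_DEBUGCTL_BTS         (1ull << 7)  /* store them in the BTS buffer   */
#define EX_DEBUGCTL_BTINT       (1ull << 8)  /* interrupt at the threshold     */
#define EX_DEBUGCTL_BTS_OFF_OS  (1ull << 9)  /* do not trace CPL 0             */
#define EX_DEBUGCTL_BTS_OFF_USR (1ull << 10) /* do not trace CPL > 0           */

static_assert((EX_DEBUGCTL_TR | EX_DEBUGCTL_BTS) == 0xc0,
              "TR and BTS together form the minimal BTS enable mask");

/*
 * Hypothetical helper: compose the DEBUGCTL value that would have to end
 * up in the guest's VMCS field for BTS to be active while the guest runs.
 */
static uint64_t bts_debugctl_value(int want_interrupt,
                                   int skip_kernel, int skip_user)
{
    uint64_t v = EX_DEBUGCTL_TR | EX_DEBUGCTL_BTS;

    if (want_interrupt)
        v |= EX_DEBUGCTL_BTINT;
    if (skip_kernel)
        v |= EX_DEBUGCTL_BTS_OFF_OS;
    if (skip_user)
        v |= EX_DEBUGCTL_BTS_OFF_USR;
    return v;
}
```

The value composed here only takes effect in the guest if it reaches the VMCS copy of DEBUGCTL; DS_AREA is a separate MSR write done in software.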



I thought I could configure the domU vCPU's VPMU to enable BTS while running in dom0 (with
modified versions of msr_write_intercept, vpmu_do_wrmsr and core2_vpmu_do_wrmsr, of
course, since the built-in ones operate on the current vCPU, which would be the dom0 vCPU)
and as soon as there is a context switch to domU the vpmu gets loaded and the 
guest starts logging.


And it should work, provided that DS_AREA is set up correctly.
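
To spell out what "set up correctly" involves, here is a rough sketch of the BTS part of the DS buffer management area, assuming the 64-bit DS format described in the Intel SDM. The struct and function names are hypothetical (not Xen's), and the addresses written into the area must be linear addresses valid in the guest's address space, which is exactly the mapping problem discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* One BTS record in the 64-bit DS format: three 8-byte fields. */
struct bts_record {
    uint64_t from;   /* linear address of the branch instruction */
    uint64_t to;     /* linear address of the branch target      */
    uint64_t flags;  /* bit 4: branch was predicted              */
};

/*
 * BTS fields at the start of the DS buffer management area. The real
 * area continues with PEBS fields, which this sketch leaves out.
 */
struct ds_area_bts {
    uint64_t bts_buffer_base;
    uint64_t bts_index;
    uint64_t bts_absolute_maximum;
    uint64_t bts_interrupt_threshold;
};

static_assert(sizeof(struct bts_record) == 24, "BTS record is 24 bytes");
static_assert(sizeof(struct ds_area_bts) == 32, "four 8-byte BTS fields");

/*
 * Fill in the BTS fields for a buffer of nr_records entries located at
 * guest_buf_va (a guest linear address, assumed to be mapped already).
 * The threshold is placed irq_slack records before the end of the buffer.
 */
static void ds_bts_init(struct ds_area_bts *ds, uint64_t guest_buf_va,
                        uint64_t nr_records, uint64_t irq_slack)
{
    ds->bts_buffer_base = guest_buf_va;
    ds->bts_index = guest_buf_va; /* next record goes here */
    ds->bts_absolute_maximum =
        guest_buf_va + nr_records * sizeof(struct bts_record);
    ds->bts_interrupt_threshold =
        ds->bts_absolute_maximum - irq_slack * sizeof(struct bts_record);
}

/* Number of records the CPU has written so far. */
static uint64_t ds_bts_records(const struct ds_area_bts *ds)
{
    return (ds->bts_index - ds->bts_buffer_base) / sizeof(struct bts_record);
}
```

Whoever consumes the trace (dom0 or the hypervisor, in this scenario) would read records between bts_buffer_base and bts_index and then reset bts_index to the base.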


If the described behavior is correct the only problem I can see is with 
allocating memory in dom0 in a way that the guest can access it.


This sounds right. All that's left now is implementation details ;-)

-boris



But if I got it wrong please explain how the vpmu really works.

Cheers

Kevin






I didn't think of using the VPMU stuff with modifying the context from
outside the guest.


* You don't send the interrupt to the guest (meaning that you will
need to somehow inform dom0 of the BTS interrupt)

and probably more.

Essentially, you want dom0 to profile the guest. I have been working
on patches that would allow that but they are still under review.



In this command I do the following:

I set up the memory region for the BTS Buffer and the DS Buffer
Management Area using xzalloc_bytes


I don't think you should be allocating BTS buffers in the
hypervisor, they are in guest's memory.

I agree. As I said I think this is where my main problem is at the moment.
Is there any way I can allocate memory in the hypervisor in a way the
guest can access it?

I am not sure this is what you want since you seem to *not* want the
guest to process the samples, right?

But yes, you can. E.g. something like what map_vcpu_info() does. (I
have no idea how you'd do this from Windows.)

The DS buffer has to be mapped