Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6

2016-11-03 Thread Konrad Rzeszutek Wilk
.. snip..
>>
>> > (XEN) Failed vm entry (exit reason 0x8021) caused by invalid guest state (4).
>>
>> 4 means invalid VMCS link pointer - interesting.
>>
>
> Hey Jan,
>
> I haven't been able to look at this for quite a while. A couple of folks have
> shown interest in looking at this, so I am CC-ing them.
>

And Matt (CC-ed) has been able to debug this a bit further as well:

"I tracked this down to incorrect Xen emulation of VMWRITE, VMPTRLD,
VMLAUNCH, and VMRESUME, in which Xen fails the
operation if the provided address can't be mapped. An L1 VMM should be
allowed to write whatever garbage it wants into the VMCS. The value may
not even be used, depending on other control fields.

Xen also shouldn't be setting RFLAGS.CF (VMfailInvalid) for any
condition other than an invalid VMCS-link pointer (it was setting
RFLAGS.CF when it couldn't map the bitmap pages)."

And in retrospect it makes sense that VMware writes garbage into the
VMCS - it is probably using the binary translation part at that point.
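
For reference, a minimal sketch of how an emulated VMX instruction reports its
outcome through RFLAGS, following the SDM conventions (VMsucceed clears all six
status flags, VMfailInvalid sets only CF, VMfailValid sets only ZF). The enum
and function names are hypothetical and are not Xen's actual helpers:

```c
#include <stdint.h>

/* RFLAGS arithmetic status bits that VMX instructions report through. */
#define X86_EFLAGS_CF (1u << 0)
#define X86_EFLAGS_PF (1u << 2)
#define X86_EFLAGS_AF (1u << 4)
#define X86_EFLAGS_ZF (1u << 6)
#define X86_EFLAGS_SF (1u << 7)
#define X86_EFLAGS_OF (1u << 11)

#define VMX_STATUS_FLAGS (X86_EFLAGS_CF | X86_EFLAGS_PF | X86_EFLAGS_AF | \
                          X86_EFLAGS_ZF | X86_EFLAGS_SF | X86_EFLAGS_OF)

/* Hypothetical names, for illustration only. */
enum vmx_insn_result { VMX_SUCCEED, VMX_FAIL_INVALID, VMX_FAIL_VALID };

/* Return the RFLAGS value the L1 guest should observe after the instruction. */
uint64_t vmx_result_to_rflags(uint64_t rflags, enum vmx_insn_result res)
{
    /* All three outcomes start by clearing the six status flags. */
    rflags &= ~(uint64_t)VMX_STATUS_FLAGS;

    if ( res == VMX_FAIL_INVALID )
        rflags |= X86_EFLAGS_CF;   /* VMfailInvalid */
    else if ( res == VMX_FAIL_VALID )
        rflags |= X86_EFLAGS_ZF;   /* VMfailValid; error number goes in the VMCS */

    return rflags;
}
```

The point of the sketch is only that the two failure modes are distinct, so a
mapping failure inside the hypervisor should not be reported to L1 as either
of them.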

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6

2016-11-02 Thread Konrad Rzeszutek Wilk
On Fri, Feb 05, 2016 at 03:33:44AM -0700, Jan Beulich wrote:
> >>> On 04.02.16 at 19:36,  wrote:
> > (XEN) nvmx_handle_vmwrite 1: IO_BITMAP_A(2000)[0=]
> > (XEN) nvmx_handle_vmwrite 0: IO_BITMAP_A(2000)[0=]
> > (XEN) nvmx_handle_vmwrite 1: IO_BITMAP_B(2002)[0=]
> > (XEN) nvmx_handle_vmwrite 2: IO_BITMAP_A(2000)[0=]
> > (XEN) nvmx_handle_vmwrite 1: VIRTUAL_APIC_PAGE_ADDR(2012)[0=]
> > (XEN) nvmx_handle_vmwrite 2: IO_BITMAP_B(2002)[0=]
> > (XEN) nvmx_handle_vmwrite 1: (2006)[0=]
> > (XEN) nvmx_handle_vmwrite 2: VIRTUAL_APIC_PAGE_ADDR(2012)[0=]
> > (XEN) nvmx_handle_vmwrite 1: VM_EXIT_MSR_LOAD_ADDR(2008)[0=]
> > (XEN) nvmx_handle_vmwrite 3: IO_BITMAP_A(2000)[0=]
> > (XEN) nvmx_handle_vmwrite 3: IO_BITMAP_B(2002)[0=]
> > (XEN) nvmx_handle_vmwrite 2: MSR_BITMAP(2004)[0=]
> > (XEN) nvmx_handle_vmwrite 1: MSR_BITMAP(2004)[0=]
> > (XEN) nvmx_handle_vmwrite 0: MSR_BITMAP(2004)[0=]
> > (XEN) nvmx_handle_vmwrite 3: (2006)[0=]
> > (XEN) nvmx_handle_vmwrite 3: VM_EXIT_MSR_LOAD_ADDR(2008)[0=]
> > (XEN) nvmx_handle_vmwrite 3: MSR_BITMAP(2004)[0=]
> 
> So there's a whole lot of "interesting" writes of all ones, and indeed
> VIRTUAL_APIC_PAGE_ADDR is among them, and the code doesn't
> handle that case (nor the equivalent for APIC_ACCESS_ADDR).
> What's odd though is that the writes are for vCPU 1 and 2, while
> the crash is on vCPU 3 (it would of course help if the guest had as
> few vCPU-s as possible without making the issue disappear). While
> you have circumvented the ASSERT() you've originally hit, the log
> messages you've added there don't appear anywhere, which is
> clearly confusing, so I wonder what other unintended effects your
> debugging code has (there's clearly an uninitialized variable issue
> in your additions to vmx_vmexit_handler(), but that shouldn't
> matter here, albeit it should have caused a build failure, making me
> suspect the patch to be stale).
> 
> Oddly enough the various bitmap field VMWRITEs above should all
> fail, yet the guest appears to recover from (ignore?) these
> failures. (From all I can tell we're prone to NULL dereferences due
> to that at least in _shadow_io_bitmap().)
> 
> > (XEN) Failed vm entry (exit reason 0x8021) caused by invalid guest state (4).
> 
> 4 means invalid VMCS link pointer - interesting.
>

Hey Jan,

I haven't been able to look at this for quite a while. A couple of folks have
shown interest in looking at this, so I am CC-ing them.

For folks that are new, it may also be worth looking at:
http://www.gossamer-threads.com/lists/xen/devel/413285?page=last
which has the full thread.

Here are also the instructions on how to reproduce it:
(This is with Xen 4.7, 'staging-4.7'.)

2) Download VMWare ESX and install it:
[root@localhost ~]# more vmware.xm
memory=8192
maxvcpus = 4
name = "VMWARE"
vif = [ 'mac=00:0f:4b:00:00:85,bridge=switch,model=e1000' ]
disk= ['phy:/dev/nested_guests/VMWare_ESX,hda,w']
#,'file:/mnt/iso/VMware-VMvisor-Installer-6.0.0.update02-3620759.x86_64.iso,hdc:cdrom,r']
#boot="dn"
kernel = "/usr/lib/xen/boot/hvmloader"
builder='hvm'
serial='pty'
vcpus = 4
vnc=1
vnclisten="0.0.0.0"
usb=1
nestedhvm=1

3) Let the guest be installed and wait until it has rebooted.
4) Enable SSH on the VMware ESXi: press F2 on the guest console, log in, select
'Troubleshooting Options', then press Enter on 'Enable ESXi Shell' and 'Enable SSH'.
5) Create a guest using the VMware ESXi Client (you need to use Windows for
that). I picked the simplest option and went ahead with FreeBSD (you can also
do Linux).
6) To download the guest ISO in VMware you can SSH into the ESXi host:
#cd /vmfs/volumes/datastore1
#wget http://ftp.freebsd.org/pub/FreeBSD/releases/ISO-IMAGES/11.0/FreeBSD-11.0-RELEASE-amd64-disc1.iso

7) In the VMware ESXi client, hook up the 'CD' to the ISO.

8) Hit Start and get greeted with: "You are running VMware ESX through an
incompatible hypervisor. You cannot power on a virtual machine until this
hypervisor is disabled". Go to
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2108724

which is just adding an attribute to the .vmx file, so log back in to the
VMware ESXi and:
[root@g-osstest:~] vi `find / -name '*.vmx'`

and add:
vmx.allowNested="TRUE"  

9) Start the guest up again in VMWare and be greeted with that splash screen. 

 (XEN) [ Xen-4.7.1-pre  x86_64  debug=n  Not tainted ]
(XEN) CPU:2
(XEN) RIP:e008:[] put_page+0x1/0xd0
(XEN) RFLAGS: 00010202   CONTEXT: hypervisor (d1v0)
(XEN) rax: 2012   rbx: 84802eddb000   rcx: 557f
(XEN) rdx: 57f0   rsi: 000ff800   rdi: 
(XEN) rbp:    rsp: 8488bf06fe98   r8:  
(XEN) r9:     r10: 0014   

Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6

2016-02-16 Thread Tian, Kevin
> From: Tian, Kevin
> Sent: Thursday, February 04, 2016 1:52 PM
> 
> > From: Jan Beulich [mailto:jbeul...@suse.com]
> > Sent: Wednesday, February 03, 2016 5:35 PM
> > > (XEN) nvmx_handle_vmwrite: 0
> > > (XEN) nvmx_handle_vmwrite: 2008
> > > (XEN) nvmx_handle_vmwrite: 2008
> > > (XEN) nvmx_handle_vmwrite: 0
> > > (XEN) nvmx_handle_vmwrite: 2008
> > > (XEN) nvmx_handle_vmwrite: 0
> > > (XEN) nvmx_handle_vmwrite: 2008
> > > (XEN) nvmx_handle_vmwrite: 2008
> > > (XEN) nvmx_handle_vmwrite: 2008
> > > (XEN) nvmx_handle_vmwrite: 2008
> > > (XEN) nvmx_handle_vmwrite: 2008
> > > (XEN) nvmx_handle_vmwrite: 800
> > > (XEN) nvmx_handle_vmwrite: 804
> > > (XEN) nvmx_handle_vmwrite: 806
> > > (XEN) nvmx_handle_vmwrite: 80a
> > > (XEN) nvmx_handle_vmwrite: 80e
> > > > (XEN) nvmx_update_virtual_apic_address: vCPU1 0x(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe sec=0
> >
> > Assuming the field starts out as other than all ones, could you check
> > its value on each of the intercepted VMWRITEs, to at least narrow
> > when it changes.
> >
> > Kevin, Jun - are there any cases where the hardware would alter
> > this field's value? Like during some guest side LAPIC manipulations?
> > (The same monitoring as suggested during VMWRITEs could of
> > course also be added to LAPIC accesses visible to the hypervisor,
> > but I guess there won't be too many of those.)
> >
> 
> No such case to my knowledge. But let me confirm with the hardware team.
> 

Confirmed no such case.

Thanks
Kevin



Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6

2016-02-05 Thread Jan Beulich
>>> On 04.02.16 at 19:36,  wrote:
> (XEN) nvmx_handle_vmwrite 1: IO_BITMAP_A(2000)[0=]
> (XEN) nvmx_handle_vmwrite 0: IO_BITMAP_A(2000)[0=]
> (XEN) nvmx_handle_vmwrite 1: IO_BITMAP_B(2002)[0=]
> (XEN) nvmx_handle_vmwrite 2: IO_BITMAP_A(2000)[0=]
> (XEN) nvmx_handle_vmwrite 1: VIRTUAL_APIC_PAGE_ADDR(2012)[0=]
> (XEN) nvmx_handle_vmwrite 2: IO_BITMAP_B(2002)[0=]
> (XEN) nvmx_handle_vmwrite 1: (2006)[0=]
> (XEN) nvmx_handle_vmwrite 2: VIRTUAL_APIC_PAGE_ADDR(2012)[0=]
> (XEN) nvmx_handle_vmwrite 1: VM_EXIT_MSR_LOAD_ADDR(2008)[0=]
> (XEN) nvmx_handle_vmwrite 3: IO_BITMAP_A(2000)[0=]
> (XEN) nvmx_handle_vmwrite 3: IO_BITMAP_B(2002)[0=]
> (XEN) nvmx_handle_vmwrite 2: MSR_BITMAP(2004)[0=]
> (XEN) nvmx_handle_vmwrite 1: MSR_BITMAP(2004)[0=]
> (XEN) nvmx_handle_vmwrite 0: MSR_BITMAP(2004)[0=]
> (XEN) nvmx_handle_vmwrite 3: (2006)[0=]
> (XEN) nvmx_handle_vmwrite 3: VM_EXIT_MSR_LOAD_ADDR(2008)[0=]
> (XEN) nvmx_handle_vmwrite 3: MSR_BITMAP(2004)[0=]

So there's a whole lot of "interesting" writes of all ones, and indeed
VIRTUAL_APIC_PAGE_ADDR is among them, and the code doesn't
handle that case (nor the equivalent for APIC_ACCESS_ADDR).
What's odd though is that the writes are for vCPU 1 and 2, while
the crash is on vCPU 3 (it would of course help if the guest had as
few vCPU-s as possible without making the issue disappear). While
you have circumvented the ASSERT() you've originally hit, the log
messages you've added there don't appear anywhere, which is
clearly confusing, so I wonder what other unintended effects your
debugging code has (there's clearly an uninitialized variable issue
in your additions to vmx_vmexit_handler(), but that shouldn't
matter here, albeit it should have caused a build failure, making me
suspect the patch to be stale).

Oddly enough the various bitmap field VMWRITEs above should all
fail, yet the guest appears to recover from (ignore?) these
failures. (From all I can tell we're prone to NULL dereferences due
to that at least in _shadow_io_bitmap().)
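
For readers matching the numbers in the log against the SDM, a small sketch of
how a VMCS field encoding such as 2012 or 80e splits into its components
(bit 0 access type, bits 9:1 index, bits 11:10 field type, bits 14:13 width).
The struct and function names are made up for this illustration:

```c
#include <stdint.h>

/* Components of a VMCS field encoding, as laid out in the SDM. */
struct vmcs_field_info {
    unsigned int access_high; /* bit 0: 1 = high 32 bits of a 64-bit field */
    unsigned int index;       /* bits 9:1 */
    unsigned int type;        /* bits 11:10 */
    unsigned int width;       /* bits 14:13 */
};

struct vmcs_field_info decode_vmcs_field(uint32_t enc)
{
    struct vmcs_field_info info;

    info.access_high = enc & 1;
    info.index = (enc >> 1) & 0x1ff;
    info.type = (enc >> 10) & 3;
    info.width = (enc >> 13) & 3;

    return info;
}

const char *vmcs_field_width_name(unsigned int width)
{
    static const char *const names[] = {
        "16-bit", "64-bit", "32-bit", "natural-width"
    };

    return names[width & 3];
}

const char *vmcs_field_type_name(unsigned int type)
{
    static const char *const names[] = {
        "control", "VM-exit information", "guest state", "host state"
    };

    return names[type & 3];
}
```

With this, 2012 (VIRTUAL_APIC_PAGE_ADDR) decodes as a 64-bit control field and
80e as a 16-bit guest-state field, which is consistent with the selector writes
seen just before the failed VM entry.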

> (XEN) Failed vm entry (exit reason 0x8021) caused by invalid guest state (4).

4 means invalid VMCS link pointer - interesting.
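
As a rough sketch of how that message decodes: bit 31 of the exit reason marks
a VM-entry failure, the low 16 bits hold the basic reason (33 is "invalid guest
state"), and the exit qualification then names the cause. The archived log
shows the exit reason as 0x8021; the usage below assumes the full 32-bit value
was 0x80000021, which is a guess. The function names are illustrative only:

```c
#include <stdint.h>

#define VMX_EXIT_REASONS_FAILED_VMENTRY 0x80000000u
#define EXIT_REASON_INVALID_GUEST_STATE 33u

/* Bit 31 set: the "exit" is really a failed VM entry. */
int is_failed_vmentry(uint32_t exit_reason)
{
    return (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) != 0;
}

/* Bits 15:0 carry the basic exit reason. */
unsigned int basic_exit_reason(uint32_t exit_reason)
{
    return exit_reason & 0xffff;
}

/* Exit qualification values for a VM-entry failure due to invalid guest state. */
const char *invalid_guest_state_cause(uint64_t qualification)
{
    switch ( qualification )
    {
    case 2:
        return "PDPTE load failure";
    case 3:
        return "NMI injection into a guest blocking events";
    case 4:
        return "invalid VMCS link pointer";
    default:
        return "not used";
    }
}
```

Under that assumption, basic_exit_reason(0x80000021) is 33 and
invalid_guest_state_cause(4) gives "invalid VMCS link pointer".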

Jan




Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6

2016-02-04 Thread Konrad Rzeszutek Wilk
On Wed, Feb 03, 2016 at 10:07:27AM -0500, Konrad Rzeszutek Wilk wrote:
> On Wed, Feb 03, 2016 at 02:34:47AM -0700, Jan Beulich wrote:
> > >>> On 02.02.16 at 23:05,  wrote:
> > > This is getting more and more bizarre.
> > > 
> > > I realized that this machine has VMCS shadowing so Xen does not trap on
> > > any vmwrite or vmread. Unless I update the VMCS shadowing bitmap - which
> > > I did for vmwrite and vmread to get a better view of this. It never
> > > traps on VIRTUAL_APIC_PAGE_ADDR accesses. It does trap on: 
> > > VIRTUAL_PROCESSOR_ID,
> > > VM_EXIT_MSR_LOAD_ADDR and GUEST_[ES,DS,FS,GS,TR]_SELECTORS.
> > > 
> > > (It may also trap on IO_BITMAP_A,B but I didn't print that out).
> > > 
> > > To confirm that the VMCS that will be given to the L2 guest is correct
> > > I added some printk()s of some state that ought to be pretty OK, such
> > > as HOST_RIP or HOST_RSP - which are all 0!
> > 
> > But did you also check what the field of interest starts out as?
> 
> I will do that.

Attached is the patch against staging (I had used 4.6 before as the only change
between those two was the dynamic mapping/unmapping of the vmread/vmwrite 
bitmap).

(d1) 
(d1) drive 0x000f6270: PCHS=16383/16/63 translation=lba LCHS=1024/255/63 s=524288000
(d1) 
(d1) Space available for UMB: cb800-ed000, f5d30-f6270
(d1) Returned 258048 bytes of ZoneHigh
(d1) e820 map has 7 items:
(d1)   0:  - 0009fc00 = 1 RAM
(d1)   1: 0009fc00 - 000a = 2 RESERVED
(d1)   2: 000f - 0010 = 2 RESERVED
(d1)   3: 0010 - e000 = 1 RAM
(d1)   4: e000 - f000 = 2 RESERVED
(d1)   5: fc00 - 0001 = 2 RESERVED
(d1)   6: 0001 - 00020f80 = 1 RAM
(d1) enter handle_19:
(d1)   NULL
(d1) Booting from Hard Disk...
(d1) Booting from :7c00
(XEN) stdvga.c:178:d1v0 leaving stdvga mode
(XEN) stdvga.c:173:d1v0 entering stdvga mode
(XEN) nvmx_handle_vmwrite 1: IO_BITMAP_A(2000)[0=]
(XEN) nvmx_handle_vmwrite 0: IO_BITMAP_A(2000)[0=]
(XEN) nvmx_handle_vmwrite 1: IO_BITMAP_B(2002)[0=]
(XEN) nvmx_handle_vmwrite 2: IO_BITMAP_A(2000)[0=]
(XEN) nvmx_handle_vmwrite 1: VIRTUAL_APIC_PAGE_ADDR(2012)[0=]
(XEN) nvmx_handle_vmwrite 2: IO_BITMAP_B(2002)[0=]
(XEN) nvmx_handle_vmwrite 1: (2006)[0=]
(XEN) nvmx_handle_vmwrite 2: VIRTUAL_APIC_PAGE_ADDR(2012)[0=]
(XEN) nvmx_handle_vmwrite 1: VM_EXIT_MSR_LOAD_ADDR(2008)[0=]
(XEN) nvmx_handle_vmwrite 3: IO_BITMAP_A(2000)[0=]
(XEN) nvmx_handle_vmwrite 3: IO_BITMAP_B(2002)[0=]
(XEN) nvmx_handle_vmwrite 2: MSR_BITMAP(2004)[0=]
(XEN) nvmx_handle_vmwrite 1: MSR_BITMAP(2004)[0=]
(XEN) nvmx_handle_vmwrite 0: MSR_BITMAP(2004)[0=]
(XEN) nvmx_handle_vmwrite 3: (2006)[0=]
(XEN) nvmx_handle_vmwrite 3: VM_EXIT_MSR_LOAD_ADDR(2008)[0=]
(XEN) nvmx_handle_vmwrite 3: MSR_BITMAP(2004)[0=]
(XEN) nvmx_handle_vmwrite 1: VIRTUAL_PROCESSOR_ID(0)[0=9]
(XEN) nvmx_handle_vmwrite 0: VIRTUAL_PROCESSOR_ID(0)[0=9]
(XEN) nvmx_handle_vmwrite 1: MSR_BITMAP(2004)[=1367ed000]
(XEN) nvmx_handle_vmwrite 3: VIRTUAL_PROCESSOR_ID(0)[0=9]
(XEN) nvmx_handle_vmwrite 0: MSR_BITMAP(2004)[=1367ed000]
(XEN) nvmx_handle_vmwrite 1: VM_EXIT_MSR_LOAD_ADDR(2008)[=135639f40]
(XEN) nvmx_handle_vmwrite 0: VM_EXIT_MSR_LOAD_ADDR(2008)[=135666f40]
(XEN) nvmx_handle_vmwrite 2: VIRTUAL_PROCESSOR_ID(0)[0=9]
(XEN) nvmx_handle_vmwrite 3: MSR_BITMAP(2004)[=1367ed000]
(XEN) nvmx_handle_vmwrite 3: VM_EXIT_MSR_LOAD_ADDR(2008)[=135693f40]
(XEN) nvmx_handle_vmwrite 2: MSR_BITMAP(2004)[=1367ed000]
(XEN) nvmx_handle_vmwrite 2: VM_EXIT_MSR_LOAD_ADDR(2008)[=135701f40]
(XEN) nvmx_handle_vmwrite 3: VM_EXIT_MSR_LOAD_ADDR(2008)[135639f40=13763cf40]
(XEN) nvmx_handle_vmwrite 1: VM_EXIT_MSR_LOAD_ADDR(2008)[135701f40=137a3cf40]
(XEN) nvmx_handle_vmwrite 0: VM_EXIT_MSR_LOAD_ADDR(2008)[135693f40=13783cf40]
(XEN) nvmx_handle_vmwrite 2: VM_EXIT_MSR_LOAD_ADDR(2008)[135666f40=137c3cf40]
(XEN) nvmx_handle_vmwrite 3: (800)[0=0]
(XEN) nvmx_handle_vmwrite 3: (804)[0=0]
(XEN) nvmx_handle_vmwrite 3: (806)[0=0]
(XEN) nvmx_handle_vmwrite 3: (80a)[0=0]
(XEN) nvmx_handle_vmwrite 3: (80e)[0=0]
(XEN) vvmx.c:2566:d1v3 Unknown nested vmexit reason 8021.
(XEN) Failed vm entry (exit reason 0x8021) caused by invalid guest state (4).
(XEN) * VMCS Area **
(XEN) *** Guest State ***
(XEN) CR0: actual=0x0030, shadow=0x, gh_mask=
(XEN) CR4: actual=0x2050, shadow=0x, gh_mask=
(XEN) CR3 = 0x80c06000
(XEN) RSP = 0x (0x)  RIP = 0x

Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6

2016-02-03 Thread Tian, Kevin
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Wednesday, February 03, 2016 5:35 PM
> > (XEN) nvmx_handle_vmwrite: 0
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 0
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 0
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 800
> > (XEN) nvmx_handle_vmwrite: 804
> > (XEN) nvmx_handle_vmwrite: 806
> > (XEN) nvmx_handle_vmwrite: 80a
> > (XEN) nvmx_handle_vmwrite: 80e
> > > (XEN) nvmx_update_virtual_apic_address: vCPU1 0x(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe sec=0
> 
> Assuming the field starts out as other than all ones, could you check
> its value on each of the intercepted VMWRITEs, to at least narrow
> when it changes.
> 
> Kevin, Jun - are there any cases where the hardware would alter
> this field's value? Like during some guest side LAPIC manipulations?
> (The same monitoring as suggested during VMWRITEs could of
> course also be added to LAPIC accesses visible to the hypervisor,
> but I guess there won't be too many of those.)
> 

No such case to my knowledge. But let me confirm with the hardware team.

Thanks
Kevin



Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6

2016-02-03 Thread Konrad Rzeszutek Wilk
On Wed, Feb 03, 2016 at 02:34:47AM -0700, Jan Beulich wrote:
> >>> On 02.02.16 at 23:05,  wrote:
> > This is getting more and more bizarre.
> > 
> > I realized that this machine has VMCS shadowing so Xen does not trap on
> > any vmwrite or vmread. Unless I update the VMCS shadowing bitmap - which
> > I did for vmwrite and vmread to get a better view of this. It never
> > traps on VIRTUAL_APIC_PAGE_ADDR accesses. It does trap on: 
> > VIRTUAL_PROCESSOR_ID,
> > VM_EXIT_MSR_LOAD_ADDR and GUEST_[ES,DS,FS,GS,TR]_SELECTORS.
> > 
> > (It may also trap on IO_BITMAP_A,B but I didn't print that out).
> > 
> > To confirm that the VMCS that will be given to the L2 guest is correct
> > I added some printk()s of some state that ought to be pretty OK, such
> > as HOST_RIP or HOST_RSP - which are all 0!
> 
> But did you also check what the field of interest starts out as?

I will do that.
> 
> > If I let the nvmx_update_virtual_apic_address keep on going without
> > modifying the VIRTUAL_APIC_PAGE_ADDR it later on crashes the nested
> > guest:
> > 
> > EN) nvmx_handle_vmwrite: 0
> 
> The missing characters at the beginning may just be a copy-and-
> paste mistake, but they could also indicate a truncated log. Can
> you clarify which of the two it is?

Just a copy-n-paste error. Nothing of interest before there:
(d1)   NULL
(d1) Booting from Hard Disk...
(d1) Booting from :7c00
(XEN) nvmx_handle_vmwrite: 0
(XEN) nvmx_handle_vmwrite: 0
..
> 
> > (XEN) nvmx_handle_vmwrite: 0
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 0
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 0
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 2008
> > (XEN) nvmx_handle_vmwrite: 800
> > (XEN) nvmx_handle_vmwrite: 804
> > (XEN) nvmx_handle_vmwrite: 806
> > (XEN) nvmx_handle_vmwrite: 80a
> > (XEN) nvmx_handle_vmwrite: 80e
> > (XEN) nvmx_update_virtual_apic_address: vCPU1 0x(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe sec=0
> 
> Assuming the field starts out as other than all ones, could you check
> its value on each of the intercepted VMWRITEs, to at least narrow
> when it changes.

Yes of course.
> 
> Kevin, Jun - are there any cases where the hardware would alter
> this field's value? Like during some guest side LAPIC manipulations?
> (The same monitoring as suggested during VMWRITEs could of
> course also be added to LAPIC accesses visible to the hypervisor,
> but I guess there won't be too many of those.)
> 
> Jan
> 



Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6

2016-02-03 Thread Jan Beulich
>>> On 02.02.16 at 23:05,  wrote:
> This is getting more and more bizarre.
> 
> I realized that this machine has VMCS shadowing so Xen does not trap on
> any vmwrite or vmread. Unless I update the VMCS shadowing bitmap - which
> I did for vmwrite and vmread to get a better view of this. It never
> traps on VIRTUAL_APIC_PAGE_ADDR accesses. It does trap on: 
> VIRTUAL_PROCESSOR_ID,
> VM_EXIT_MSR_LOAD_ADDR and GUEST_[ES,DS,FS,GS,TR]_SELECTORS.
> 
> (It may also trap on IO_BITMAP_A,B but I didn't print that out).
> 
> To confirm that the VMCS that will be given to the L2 guest is correct
> I added some printk()s of some state that ought to be pretty OK, such
> as HOST_RIP or HOST_RSP - which are all 0!

But did you also check what the field of interest starts out as?

> If I let the nvmx_update_virtual_apic_address keep on going without
> modifying the VIRTUAL_APIC_PAGE_ADDR it later on crashes the nested
> guest:
> 
> EN) nvmx_handle_vmwrite: 0   

The missing characters at the beginning may just be a copy-and-
paste mistake, but they could also indicate a truncated log. Can
you clarify which of the two it is?

> (XEN) nvmx_handle_vmwrite: 0 
> (XEN) nvmx_handle_vmwrite: 2008  
> (XEN) nvmx_handle_vmwrite: 2008  
> (XEN) nvmx_handle_vmwrite: 0 
> (XEN) nvmx_handle_vmwrite: 2008  
> (XEN) nvmx_handle_vmwrite: 0 
> (XEN) nvmx_handle_vmwrite: 2008  
> (XEN) nvmx_handle_vmwrite: 2008  
> (XEN) nvmx_handle_vmwrite: 2008  
> (XEN) nvmx_handle_vmwrite: 2008  
> (XEN) nvmx_handle_vmwrite: 2008  
> (XEN) nvmx_handle_vmwrite: 800   
> (XEN) nvmx_handle_vmwrite: 804   
> (XEN) nvmx_handle_vmwrite: 806   
> (XEN) nvmx_handle_vmwrite: 80a   
> (XEN) nvmx_handle_vmwrite: 80e   
> (XEN) nvmx_update_virtual_apic_address: vCPU1 0x(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe sec=0

Assuming the field starts out as other than all ones, could you check
its value on each of the intercepted VMWRITEs, to at least narrow
when it changes.

Kevin, Jun - are there any cases where the hardware would alter
this field's value? Like during some guest side LAPIC manipulations?
(The same monitoring as suggested during VMWRITEs could of
course also be added to LAPIC accesses visible to the hypervisor,
but I guess there won't be too many of those.)

Jan




Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6

2016-02-02 Thread Konrad Rzeszutek Wilk
On Mon, Jan 18, 2016 at 02:41:52AM -0700, Jan Beulich wrote:
> >>> On 15.01.16 at 22:39,  wrote:
> > On Tue, Jan 12, 2016 at 02:22:03AM -0700, Jan Beulich wrote:
> >> Since we can (I hope) pretty much exclude a paging type, the
> >> ASSERT() must have triggered because of vapic_pg being NULL.
> >> That might be verifiable without extra printk()s, just by checking
> >> the disassembly (assuming the value sits in a register). In which
> >> case vapic_gpfn would be of interest too.
> > 
> > The vapic_gpfn is 0x.
> > 
> > To be exact:
> > 
> > nvmx_update_virtual_apic_address:vCPU0 0x(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe
> > 
> > Based on this:
> > 
> > diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
> > index cb6f9b8..8a0abfc 100644
> > --- a/xen/arch/x86/hvm/vmx/vvmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> > @@ -695,7 +695,15 @@ static void nvmx_update_virtual_apic_address(struct 
> > vcpu *v)
> >  
> >  vapic_gpfn = __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR) 
> > >> PAGE_SHIFT;
> >  vapic_pg = get_page_from_gfn(v->domain, vapic_gpfn, &p2mt, 
> > P2M_ALLOC);
> > -ASSERT(vapic_pg && !p2m_is_paging(p2mt));
> > +   if ( !vapic_pg ) {
> > +   printk("%s:vCPU%d 0x%lx(vAPIC) 0x%lx(APIC), 0x%lx(TPR) 
> > ctrl=%x\n", __func__,v->vcpu_id,
> > +   __get_vvmcs(nvcpu->nv_vvmcx, 
> > VIRTUAL_APIC_PAGE_ADDR),
> > +   __get_vvmcs(nvcpu->nv_vvmcx, APIC_ACCESS_ADDR),
> > +   __get_vvmcs(nvcpu->nv_vvmcx, TPR_THRESHOLD),
> > +   ctrl);
> > +   }
> > +ASSERT(vapic_pg);
> > +   ASSERT(vapic_pg && !p2m_is_paging(p2mt));
> >  __vmwrite(VIRTUAL_APIC_PAGE_ADDR, page_to_maddr(vapic_pg));
> >  put_page(vapic_pg);
> >  }
> 
> Interesting: I can't see VIRTUAL_APIC_PAGE_ADDR to be written
> with all ones anywhere, neither for the real VMCS nor for the virtual
> one (page_to_maddr() can't, afaict, return such a value). Could you
> check where the L1 guest itself is writing that value, or whether it
> fails to initialize that field and it happens to start out as all ones?

This is getting more and more bizarre.

I realized that this machine has VMCS shadowing so Xen does not trap on
any vmwrite or vmread. Unless I update the VMCS shadowing bitmap - which
I did for vmwrite and vmread to get a better view of this. It never
traps on VIRTUAL_APIC_PAGE_ADDR accesses. It does trap on: VIRTUAL_PROCESSOR_ID,
VM_EXIT_MSR_LOAD_ADDR and GUEST_[ES,DS,FS,GS,TR]_SELECTORS.
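
For anyone wanting to repeat this, a minimal sketch of what "updating the VMCS
shadowing bitmap" amounts to: the VMREAD and VMWRITE bitmaps are each one 4 KiB
page, indexed by bits 14:0 of the field encoding, and a set bit makes the
access exit to the hypervisor instead of going to the shadow VMCS. The helper
names are hypothetical and not what Xen uses:

```c
#include <stdint.h>

/* Each of the VMREAD/VMWRITE bitmaps is one 4 KiB page. */
#define VMCS_SHADOW_BITMAP_BYTES 4096

/* Set the bit: accesses to this field from L1 now cause a VM exit. */
void shadow_bitmap_intercept(uint8_t *bitmap, uint32_t enc)
{
    uint32_t bit = enc & 0x7fff;   /* bits 14:0 of the encoding select the bit */

    bitmap[bit / 8] |= (uint8_t)(1u << (bit % 8));
}

/* Clear the bit: accesses are satisfied from the shadow VMCS without an exit. */
void shadow_bitmap_passthrough(uint8_t *bitmap, uint32_t enc)
{
    uint32_t bit = enc & 0x7fff;

    bitmap[bit / 8] &= (uint8_t)~(1u << (bit % 8));
}

int shadow_bitmap_is_intercepted(const uint8_t *bitmap, uint32_t enc)
{
    uint32_t bit = enc & 0x7fff;

    return (bitmap[bit / 8] >> (bit % 8)) & 1;
}
```

Intercepting VIRTUAL_APIC_PAGE_ADDR would then be
shadow_bitmap_intercept(vmwrite_bitmap, 0x2012), where vmwrite_bitmap stands
for whatever page backs the VMWRITE bitmap.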

(It may also trap on IO_BITMAP_A,B but I didn't print that out).

To confirm that the VMCS that will be given to the L2 guest is correct
I added some printk()s of some state that ought to be pretty OK, such
as HOST_RIP or HOST_RSP - which are all 0!

If I let the nvmx_update_virtual_apic_address keep on going without
modifying the VIRTUAL_APIC_PAGE_ADDR it later on crashes the nested
guest:

EN) nvmx_handle_vmwrite: 0
(XEN) nvmx_handle_vmwrite: 0
(XEN) nvmx_handle_vmwrite: 2008 
(XEN) nvmx_handle_vmwrite: 2008 
(XEN) nvmx_handle_vmwrite: 0
(XEN) nvmx_handle_vmwrite: 2008 
(XEN) nvmx_handle_vmwrite: 0
(XEN) nvmx_handle_vmwrite: 2008 
(XEN) nvmx_handle_vmwrite: 2008 
(XEN) nvmx_handle_vmwrite: 2008 
(XEN) nvmx_handle_vmwrite: 2008 
(XEN) nvmx_handle_vmwrite: 2008 
(XEN) nvmx_handle_vmwrite: 800  
(XEN) nvmx_handle_vmwrite: 804  
(XEN) nvmx_handle_vmwrite: 806  
(XEN) nvmx_handle_vmwrite: 80a  
(XEN) nvmx_handle_vmwrite: 80e  
(XEN) nvmx_update_virtual_apic_address: vCPU1 0x(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe sec=0
(XEN) nvmx_update_virtual_apic_address: TPR threshold = 0x0 updated 0.  
(XEN) nvmx_update_virtual_apic_address: Virtual APIC = 0x0 updated 0.   
(XEN) nvmx_update_virtual_apic_address: APIC address = 0x0 updated 0.   
(XEN) HOST_RIP=0x0 HOST_RSP=0x0 
(XEN)  error code 7 
(XEN) domain_crash_sync called from vmcs.c:1597 
(XEN) Domain 1 (vcpu#1) crashed on cpu#37:  

Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6

2016-01-18 Thread Jan Beulich
>>> On 15.01.16 at 22:39,  wrote:
> On Tue, Jan 12, 2016 at 02:22:03AM -0700, Jan Beulich wrote:
>> Since we can (I hope) pretty much exclude a paging type, the
>> ASSERT() must have triggered because of vapic_pg being NULL.
>> That might be verifiable without extra printk()s, just by checking
>> the disassembly (assuming the value sits in a register). In which
>> case vapic_gpfn would be of interest too.
> 
> The vapic_gpfn is 0x.
> 
> To be exact:
> 
> nvmx_update_virtual_apic_address:vCPU0 0x(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe
> 
> Based on this:
> 
> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
> index cb6f9b8..8a0abfc 100644
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -695,7 +695,15 @@ static void nvmx_update_virtual_apic_address(struct vcpu 
> *v)
>  
>  vapic_gpfn = __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR) >> 
> PAGE_SHIFT;
>  vapic_pg = get_page_from_gfn(v->domain, vapic_gpfn, &p2mt, 
> P2M_ALLOC);
> -ASSERT(vapic_pg && !p2m_is_paging(p2mt));
> +   if ( !vapic_pg ) {
> +   printk("%s:vCPU%d 0x%lx(vAPIC) 0x%lx(APIC), 0x%lx(TPR) 
> ctrl=%x\n", __func__,v->vcpu_id,
> +   __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR),
> +   __get_vvmcs(nvcpu->nv_vvmcx, APIC_ACCESS_ADDR),
> +   __get_vvmcs(nvcpu->nv_vvmcx, TPR_THRESHOLD),
> +   ctrl);
> +   }
> +ASSERT(vapic_pg);
> +   ASSERT(vapic_pg && !p2m_is_paging(p2mt));
>  __vmwrite(VIRTUAL_APIC_PAGE_ADDR, page_to_maddr(vapic_pg));
>  put_page(vapic_pg);
>  }

Interesting: I can't see VIRTUAL_APIC_PAGE_ADDR to be written
with all ones anywhere, neither for the real VMCS nor for the virtual
one (page_to_maddr() can't, afaict, return such a value). Could you
check where the L1 guest itself is writing that value, or whether it
fails to initialize that field and it happens to start out as all ones?
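
To make the all-ones case concrete, here is a minimal sketch of the kind of
sanity check that could precede the page lookup: the SDM requires a
virtual-APIC address to be 4 KiB aligned and to fit within the processor's
physical-address width. The function names and the paddr_bits parameter are
assumptions for illustration; this is not code from vvmx.c:

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ull << PAGE_SHIFT)

/* Same shift the patch above applies before calling get_page_from_gfn(). */
uint64_t addr_to_gfn(uint64_t addr)
{
    return addr >> PAGE_SHIFT;
}

/*
 * Could this value be a usable virtual-APIC address at all?
 * paddr_bits is the physical-address width (hypothetical parameter).
 */
bool vapic_addr_plausible(uint64_t addr, unsigned int paddr_bits)
{
    if ( addr & (PAGE_SIZE - 1) )
        return false;               /* bits 11:0 must be zero */

    if ( paddr_bits < 64 && (addr >> paddr_bits) )
        return false;               /* no bits beyond the physical width */

    return true;
}
```

An all-ones value fails both tests, so a lookup on the resulting frame number
cannot return a page, which would match vapic_pg being NULL.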

>> What looks odd to me is the connection between
>> CPU_BASED_TPR_SHADOW being set and the use of a (valid)
>> virtual APIC page: Wouldn't this rather need to depend on
>> SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES, just like in
>> nvmx_update_apic_access_address()?
> 
> Could be. I added a read of the secondary control:
> 
> nvmx_update_virtual_apic_address:vCPU2 0x(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe sec=0
> 
> So trying your recommendation:
> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
> index cb6f9b8..d291c91 100644
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -686,8 +686,8 @@ static void nvmx_update_virtual_apic_address(struct vcpu 
> *v)
>  struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
>  u32 ctrl;
>  
> -ctrl = __n2_exec_control(v);
> -if ( ctrl & CPU_BASED_TPR_SHADOW )
> +ctrl = __n2_secondary_exec_control(v);
> +if ( ctrl & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES )
>  {
>  p2m_type_t p2mt;
>  unsigned long vapic_gpfn;
> 
> 
> Got me:
> (XEN) stdvga.c:151:d1v0 leaving stdvga mode
> (XEN) stdvga.c:147:d1v0 entering stdvga and caching modes
> (XEN) stdvga.c:520:d1v0 leaving caching mode
> (XEN) vvmx.c:2491:d1v0 Unknown nested vmexit reason 8021.
> (XEN) Failed vm entry (exit reason 0x8021) caused by invalid guest state 

Interesting. I've just noticed that a similar odd looking (to me)
dependency exists in construct_vmcs(). Perhaps I've overlooked
something in the SDM. In any event I think some words from the
VMX maintainers would be quite nice here.
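
To spell out what the logged control values say, a small sketch decoding
ctrl=b5b9effe and sec=0 against the three bits under discussion. The constant
names and values are the ones Xen's VMX headers use, as far as I recall; the
helper functions are made up for this example:

```c
#include <stdbool.h>
#include <stdint.h>

/* Primary processor-based execution controls. */
#define CPU_BASED_TPR_SHADOW                    0x00200000u /* bit 21 */
#define CPU_BASED_ACTIVATE_SECONDARY_CONTROLS   0x80000000u /* bit 31 */
/* Secondary processor-based execution controls. */
#define SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES 0x00000001u /* bit 0 */

bool uses_tpr_shadow(uint32_t ctrl)
{
    return (ctrl & CPU_BASED_TPR_SHADOW) != 0;
}

bool secondary_controls_active(uint32_t ctrl)
{
    return (ctrl & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) != 0;
}

/* Secondary controls only count when the primary control activates them. */
bool virtualizes_apic_accesses(uint32_t ctrl, uint32_t sec)
{
    return secondary_controls_active(ctrl) &&
           (sec & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES) != 0;
}
```

For the values in the log, uses_tpr_shadow(0xb5b9effe) is true while
virtualizes_apic_accesses(0xb5b9effe, 0) is false, so the two conditions really
do select different paths for this guest.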

Sadly the VMCS dump doesn't include the two APIC related
addresses...

Jan




Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6

2016-01-15 Thread Konrad Rzeszutek Wilk
On Tue, Jan 12, 2016 at 02:22:03AM -0700, Jan Beulich wrote:
> >>> On 12.01.16 at 04:38,  wrote:
> > (XEN) Assertion 'vapic_pg && !p2m_is_paging(p2mt)' failed at vvmx.c:698
> > (XEN) [ Xen-4.6.0  x86_64  debug=y  Tainted:C ]
> > (XEN) CPU:39
> > (XEN) RIP:e008:[] virtual_vmentry+0x487/0xac9
> > (XEN) RFLAGS: 00010246   CONTEXT: hypervisor (d1v3)
> > (XEN) rax:    rbx: 83007786c000   rcx: 
> > (XEN) rdx: 0e00   rsi: 000f   rdi: 83407f81e010
> > (XEN) rbp: 834008a47ea8   rsp: 834008a47e38   r8: 
> > (XEN) r9:     r10:    r11: 
> > (XEN) r12:    r13: 82c000341000   r14: 834008a47f18
> > (XEN) r15: 83407f7c4000   cr0: 80050033   cr4: 001526e0
> > (XEN) cr3: 00407fb22000   cr2: 
> > (XEN) ds:    es:    fs:    gs:    ss:    cs: e008
> > (XEN) Xen stack trace from rsp=834008a47e38:
> > (XEN)834008a47e68 82d0801d2cde 834008a47e68 0d00
> > (XEN)  834008a47e88 0004801cc30e
> > (XEN)83007786c000 83007786c000 834008a4 
> > (XEN)834008a47f18  834008a47f08 82d0801edf94
> > (XEN)834008a47ef8  834008f62000 834008a47f18
> > (XEN)00ae8c99eb8d 83007786c000  
> > (XEN)   82d0801ee2ab
> > (XEN)   
> > (XEN)   
> > (XEN)   
> > (XEN)078bfbff   beefbeef
> > (XEN)fc4b3440 00bfbeef 00040046 fc607f00
> > (XEN)beef beef beef beef
> > (XEN)beef 0027 83007786c000 006f88716300
> > (XEN)
> > (XEN) Xen call trace:
> > (XEN)[] virtual_vmentry+0x487/0xac9
> > (XEN)[] nvmx_switch_guest+0x8ff/0x915
> > (XEN)[] vmx_asm_vmexit_handler+0x4b/0xc0
> > (XEN)
> > (XEN)
> > (XEN) 
> > (XEN) Panic on CPU 39:
> > (XEN) Assertion 'vapic_pg && !p2m_is_paging(p2mt)' failed at vvmx.c:698
> > (XEN) 
> > (XEN)
> > 
> > ..and then to my surprise the hypervisor stopped hitting this.
> 
> Since we can (I hope) pretty much exclude a paging type, the
> ASSERT() must have triggered because of vapic_pg being NULL.
> That might be verifiable without extra printk()s, just by checking
> the disassembly (assuming the value sits in a register). In which
> case vapic_gpfn would be of interest too.

The vapic_gpfn is 0x.

To be exact:

nvmx_update_virtual_apic_address:vCPU0 0x(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe

Based on this:

diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index cb6f9b8..8a0abfc 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -695,7 +695,15 @@ static void nvmx_update_virtual_apic_address(struct vcpu *v)
 
 vapic_gpfn = __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR) >> PAGE_SHIFT;
 vapic_pg = get_page_from_gfn(v->domain, vapic_gpfn, &p2mt, P2M_ALLOC);
-ASSERT(vapic_pg && !p2m_is_paging(p2mt));
+   if ( !vapic_pg ) {
+   printk("%s:vCPU%d 0x%lx(vAPIC) 0x%lx(APIC), 0x%lx(TPR) ctrl=%x\n", __func__,v->vcpu_id,
+   __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR),
+   __get_vvmcs(nvcpu->nv_vvmcx, APIC_ACCESS_ADDR),
+   __get_vvmcs(nvcpu->nv_vvmcx, TPR_THRESHOLD),
+   ctrl);
+   }
+ASSERT(vapic_pg);
+   ASSERT(vapic_pg && !p2m_is_paging(p2mt));
 __vmwrite(VIRTUAL_APIC_PAGE_ADDR, page_to_maddr(vapic_pg));
 put_page(vapic_pg);
 }

> 
> What looks odd to me is the connection between
> CPU_BASED_TPR_SHADOW being set and the use of a (valid)
> virtual APIC page: Wouldn't this rather need to depend on
> SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES, just like in
> nvmx_update_apic_access_address()?

Could be. I added a read of the secondary controls:

nvmx_update_virtual_apic_address:vCPU2 0x(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe sec=0
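
For reference, the two control words in that line can be decoded by hand.
Below is a minimal sketch, assuming the bit positions documented in the
Intel SDM (primary processor-based controls: bit 21 "use TPR shadow",
bit 31 "activate secondary controls"; secondary controls: bit 0
"virtualize APIC accesses"). The macro names follow the ones used in this
thread; the two helper functions are hypothetical and are not Xen code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions assumed from the Intel SDM, not copied from Xen headers. */
#define CPU_BASED_TPR_SHADOW                    0x00200000u /* bit 21 */
#define CPU_BASED_ACTIVATE_SECONDARY_CONTROLS   0x80000000u /* bit 31 */
#define SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES 0x00000001u /* bit 0  */

/* Hypothetical helper: does the primary control word enable the TPR shadow? */
static bool uses_tpr_shadow(uint32_t ctrl)
{
    return (ctrl & CPU_BASED_TPR_SHADOW) != 0;
}

/*
 * Hypothetical helper: the secondary controls only count when the primary
 * word activates them, so both words have to be checked.
 */
static bool virtualizes_apic_accesses(uint32_t ctrl, uint32_t sec)
{
    return (ctrl & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) != 0 &&
           (sec & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES) != 0;
}
```

With ctrl=0xb5b9effe and sec=0 as printed above, uses_tpr_shadow() is true
while virtualizes_apic_accesses() is false, so the two conditions being
discussed give different answers for this guest.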

So trying your recommendation:
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index cb6f9b8..d291c91 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -686,8 +686,8 @@ static void nvmx_update_virtual_apic_address(struct vcpu *v)
 struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
 u32 ctrl;
 
-ctrl = __n2_ex

Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6

2016-01-12 Thread Alvin Starr

Ensure that memory and maxmem are set to the same value.



On 01/11/2016 10:38 PM, Konrad Rzeszutek Wilk wrote:

Hey,

The machine is an X5-2 which is a Haswell based E5-2699 v3.

We are trying to use nested virtualization. The
guest is a simple VMware vSphere 6.0 with 32GB, 8 CPUs.

The guest that is then launched within VMware is a 2 VCPU 2GB Linux
(OEL6 to be exact). During its bootup Xen crashes with this assert.

Oddly enough if this is repeated on a workstation Ivy Bridge CPU (i5-3570)
it works fine.

Disabling APICv (apicv=0) on the Xen command line did not help.

I added some debug code to see if the vapic_pg is bad and what
the p2mt type is [read below]


Serial console started.  To stop, type ESC (
(XEN) Assertion 'vapic_pg && !p2m_is_paging(p2mt)' failed at vvmx.c:698
(XEN) [ Xen-4.6.0  x86_64  debug=y  Tainted:C ]
(XEN) CPU:39
(XEN) RIP:e008:[] virtual_vmentry+0x487/0xac9
(XEN) RFLAGS: 00010246   CONTEXT: hypervisor (d1v3)
(XEN) rax:    rbx: 83007786c000   rcx: 
(XEN) rdx: 0e00   rsi: 000f   rdi: 83407f81e010
(XEN) rbp: 834008a47ea8   rsp: 834008a47e38   r8: 
(XEN) r9:     r10:    r11: 
(XEN) r12:    r13: 82c000341000   r14: 834008a47f18
(XEN) r15: 83407f7c4000   cr0: 80050033   cr4: 001526e0
(XEN) cr3: 00407fb22000   cr2: 
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen stack trace from rsp=834008a47e38:
(XEN)834008a47e68 82d0801d2cde 834008a47e68 0d00
(XEN)  834008a47e88 0004801cc30e
(XEN)83007786c000 83007786c000 834008a4 
(XEN)834008a47f18  834008a47f08 82d0801edf94
(XEN)834008a47ef8  834008f62000 834008a47f18
(XEN)00ae8c99eb8d 83007786c000  
(XEN)   82d0801ee2ab
(XEN)   
(XEN)   
(XEN)   
(XEN)078bfbff   beefbeef
(XEN)fc4b3440 00bfbeef 00040046 fc607f00
(XEN)beef beef beef beef
(XEN)beef 0027 83007786c000 006f88716300
(XEN)
(XEN) Xen call trace:
(XEN)[] virtual_vmentry+0x487/0xac9
(XEN)[] nvmx_switch_guest+0x8ff/0x915
(XEN)[] vmx_asm_vmexit_handler+0x4b/0xc0
(XEN)
(XEN)
(XEN) 
(XEN) Panic on CPU 39:
(XEN) Assertion 'vapic_pg && !p2m_is_paging(p2mt)' failed at vvmx.c:698
(XEN) 
(XEN)

..and then to my surprise the hypervisor stopped hitting this. Instead
I started getting an even more bizarre crash:


(d1) enter handle_19:
(d1)   NULL
(d1) Booting from Hard Disk...
(d1) Booting from :7c00
(XEN) stdvga.c:151:d1v0 leaving stdvga mode
(XEN) stdvga.c:147:d1v0 entering stdvga and caching modes
(XEN) stdvga.c:520:d1v0 leaving caching mode
(XEN) [ Xen-4.6.0  x86_64  debug=y  Tainted:C ]
(XEN) CPU:3
(XEN) RIP:e008:[] vmx_cpu_up+0xacc/0xba5
(XEN) RFLAGS: 00010242   CONTEXT: hypervisor (d1v1)
(XEN) rax:    rbx: 830077877000   rcx: 834077e54000
(XEN) rdx: 834007dc8000   rsi: 2000   rdi: 830077877000
(XEN) rbp: 834007dcfc48   rsp: 834007dcfc38   r8:  0404
(XEN) r9:  000ff000   r10:    r11: fc423f1e
(XEN) r12: 2000   r13:    r14: 
(XEN) r15:    cr0: 80050033   cr4: 001526e0
(XEN) cr3: 004000763000   cr2: 
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen stack trace from rsp=834007dcfc38:
(XEN)834007dcfc98  834007dcfc68 82d0801e2533
(XEN)830077877000 2000 834007dcfc78 82d0801ea933
(XEN)834007dcfca8 82d0801eaae4  830077877000
(XEN) 834007dcff18 834007dcfd08 82d0801eb983
(XEN)8341 00013692c000 8340 fc607f28
(XEN)0008 8346 834007dcff18 830077877000
(XEN)0015  834007dcff08 82d0801e8c8d
(XEN)834007763000 8300778c2000 8340007c3000 834007dcfd50
(XEN)82d0801e120b 834007dcfd50 830077877000 834007dcfdf0
(XEN)  82d08012fe0b

Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6

2016-01-12 Thread Jan Beulich
>>> On 12.01.16 at 04:38,  wrote:
> (XEN) Assertion 'vapic_pg && !p2m_is_paging(p2mt)' failed at vvmx.c:698
> (XEN) [ Xen-4.6.0  x86_64  debug=y  Tainted:C ]
> (XEN) CPU:39
> (XEN) RIP:e008:[] virtual_vmentry+0x487/0xac9
> (XEN) RFLAGS: 00010246   CONTEXT: hypervisor (d1v3)
> (XEN) rax:    rbx: 83007786c000   rcx: 
> (XEN) rdx: 0e00   rsi: 000f   rdi: 83407f81e010
> (XEN) rbp: 834008a47ea8   rsp: 834008a47e38   r8: 
> (XEN) r9:     r10:    r11: 
> (XEN) r12:    r13: 82c000341000   r14: 834008a47f18
> (XEN) r15: 83407f7c4000   cr0: 80050033   cr4: 001526e0
> (XEN) cr3: 00407fb22000   cr2: 
> (XEN) ds:    es:    fs:    gs:    ss:    cs: e008
> (XEN) Xen stack trace from rsp=834008a47e38:
> (XEN)834008a47e68 82d0801d2cde 834008a47e68 0d00
> (XEN)  834008a47e88 0004801cc30e
> (XEN)83007786c000 83007786c000 834008a4 
> (XEN)834008a47f18  834008a47f08 82d0801edf94
> (XEN)834008a47ef8  834008f62000 834008a47f18
> (XEN)00ae8c99eb8d 83007786c000  
> (XEN)   82d0801ee2ab
> (XEN)   
> (XEN)   
> (XEN)   
> (XEN)078bfbff   beefbeef
> (XEN)fc4b3440 00bfbeef 00040046 fc607f00
> (XEN)beef beef beef beef
> (XEN)beef 0027 83007786c000 006f88716300
> (XEN)
> (XEN) Xen call trace:
> (XEN)[] virtual_vmentry+0x487/0xac9
> (XEN)[] nvmx_switch_guest+0x8ff/0x915
> (XEN)[] vmx_asm_vmexit_handler+0x4b/0xc0
> (XEN)
> (XEN)
> (XEN) 
> (XEN) Panic on CPU 39:
> (XEN) Assertion 'vapic_pg && !p2m_is_paging(p2mt)' failed at vvmx.c:698
> (XEN) 
> (XEN)
> 
> ..and then to my surprise the hypervisor stopped hitting this.

Since we can (I hope) pretty much exclude a paging type, the
ASSERT() must have triggered because of vapic_pg being NULL.
That might be verifiable without extra printk()s, just by checking
the disassembly (assuming the value sits in a register). In which
case vapic_gpfn would be of interest too.

What looks odd to me is the connection between
CPU_BASED_TPR_SHADOW being set and the use of a (valid)
virtual APIC page: Wouldn't this rather need to depend on
SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES, just like in
nvmx_update_apic_access_address()?

Anyway, the writing of zero to the respective VMCS field in the
alternative worries me a little: Aren't we risking MFN zero being
wrongly accessed because of this?

Furthermore, nvmx_update_apic_access_address() having a
similar ASSERT() seems entirely wrong: The APIC access
page doesn't really need to match up with any actual page
belonging to the guest - a guest could choose to point this
into nowhere (note that we've been at least considering this
option recently for our own purposes, in the context of
http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02191.html).
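
The point about not asserting on guest-controlled addresses can be
illustrated with a toy model. This is only a sketch under stated
assumptions: struct model_page, struct model_domain, model_lookup_gfn()
and model_resolve_guest_addr() are hypothetical stand-ins invented for
illustration, not Xen's p2m interfaces, and 4K pages are assumed. It shows
a lookup that reports an unbacked address to its caller instead of
bringing the host down.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assuming 4K pages */

/* Hypothetical stand-in for a backing page; not Xen's struct page_info. */
struct model_page {
    uint64_t maddr; /* machine address backing this guest frame */
};

/* Hypothetical toy p2m: guest frame N is backed only if N < nr_pages. */
struct model_domain {
    struct model_page *pages;
    uint64_t nr_pages;
};

enum resolve_result {
    RESOLVE_OK,       /* address is backed, *maddr_out is valid */
    RESOLVE_UNBACKED  /* guest pointed the field at nothing */
};

/* Returns NULL for any frame the toy domain does not own. */
static struct model_page *model_lookup_gfn(const struct model_domain *d,
                                           uint64_t gfn)
{
    return gfn < d->nr_pages ? &d->pages[gfn] : NULL;
}

/*
 * The address comes from the guest, so a failed lookup is an expected
 * outcome that the caller has to handle, not an invariant to assert on.
 */
static enum resolve_result model_resolve_guest_addr(const struct model_domain *d,
                                                    uint64_t gpa,
                                                    uint64_t *maddr_out)
{
    const struct model_page *pg = model_lookup_gfn(d, gpa >> PAGE_SHIFT);

    if (pg == NULL)
        return RESOLVE_UNBACKED;

    *maddr_out = pg->maddr;
    return RESOLVE_OK;
}
```

What the caller does on RESOLVE_UNBACKED (for example, leaving the feature
disabled for that entry) is a separate design decision that this sketch
deliberately leaves open.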

> Instead I started getting an even more bizarre crash:
> 
> 
> (d1) enter handle_19:
> (d1)   NULL
> (d1) Booting from Hard Disk...
> (d1) Booting from :7c00
> (XEN) stdvga.c:151:d1v0 leaving stdvga mode
> (XEN) stdvga.c:147:d1v0 entering stdvga and caching modes
> (XEN) stdvga.c:520:d1v0 leaving caching mode
> (XEN) [ Xen-4.6.0  x86_64  debug=y  Tainted:C ]
> (XEN) CPU:3
> (XEN) RIP:e008:[] vmx_cpu_up+0xacc/0xba5
> (XEN) RFLAGS: 00010242   CONTEXT: hypervisor (d1v1)
> (XEN) rax:    rbx: 830077877000   rcx: 834077e54000
> (XEN) rdx: 834007dc8000   rsi: 2000   rdi: 830077877000
> (XEN) rbp: 834007dcfc48   rsp: 834007dcfc38   r8:  0404
> (XEN) r9:  000ff000   r10:    r11: fc423f1e
> (XEN) r12: 2000   r13:    r14: 
> (XEN) r15:    cr0: 80050033   cr4: 001526e0
> (XEN) cr3: 004000763000   cr2: 
> (XEN) ds:    es:    fs:    gs:    ss:    cs: e008
> (XEN) Xen stack trace from rsp=834007dcfc38:
> (XEN)834007dcfc98  834007dcfc68 82d0801e2533
> (XEN)830077877000 2000 834007dcfc78 82d0801ea933
> (XEN)834007dcfca8 f

[Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6

2016-01-11 Thread Konrad Rzeszutek Wilk
Hey,

The machine is an X5-2 which is a Haswell based E5-2699 v3.

We are trying to use nested virtualization. The
guest is a simple VMware vSphere 6.0 with 32GB, 8 CPUs.

The guest that is then launched within VMware is a 2 VCPU 2GB Linux
(OEL6 to be exact). During its bootup Xen crashes with this assert.

Oddly enough if this is repeated on a workstation Ivy Bridge CPU (i5-3570)
it works fine.

Disabling APICv (apicv=0) on the Xen command line did not help.

I added some debug code to see if the vapic_pg is bad and what
the p2mt type is [read below]


Serial console started.  To stop, type ESC (
(XEN) Assertion 'vapic_pg && !p2m_is_paging(p2mt)' failed at vvmx.c:698
(XEN) [ Xen-4.6.0  x86_64  debug=y  Tainted:C ]
(XEN) CPU:39
(XEN) RIP:e008:[] virtual_vmentry+0x487/0xac9
(XEN) RFLAGS: 00010246   CONTEXT: hypervisor (d1v3)
(XEN) rax:    rbx: 83007786c000   rcx: 
(XEN) rdx: 0e00   rsi: 000f   rdi: 83407f81e010
(XEN) rbp: 834008a47ea8   rsp: 834008a47e38   r8: 
(XEN) r9:     r10:    r11: 
(XEN) r12:    r13: 82c000341000   r14: 834008a47f18
(XEN) r15: 83407f7c4000   cr0: 80050033   cr4: 001526e0
(XEN) cr3: 00407fb22000   cr2: 
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen stack trace from rsp=834008a47e38:
(XEN)834008a47e68 82d0801d2cde 834008a47e68 0d00
(XEN)  834008a47e88 0004801cc30e
(XEN)83007786c000 83007786c000 834008a4 
(XEN)834008a47f18  834008a47f08 82d0801edf94
(XEN)834008a47ef8  834008f62000 834008a47f18
(XEN)00ae8c99eb8d 83007786c000  
(XEN)   82d0801ee2ab
(XEN)   
(XEN)   
(XEN)   
(XEN)078bfbff   beefbeef
(XEN)fc4b3440 00bfbeef 00040046 fc607f00
(XEN)beef beef beef beef
(XEN)beef 0027 83007786c000 006f88716300
(XEN)
(XEN) Xen call trace:
(XEN)[] virtual_vmentry+0x487/0xac9
(XEN)[] nvmx_switch_guest+0x8ff/0x915
(XEN)[] vmx_asm_vmexit_handler+0x4b/0xc0
(XEN)
(XEN)
(XEN) 
(XEN) Panic on CPU 39:
(XEN) Assertion 'vapic_pg && !p2m_is_paging(p2mt)' failed at vvmx.c:698
(XEN) 
(XEN)

..and then to my surprise the hypervisor stopped hitting this. Instead
I started getting an even more bizarre crash:


(d1) enter handle_19:
(d1)   NULL
(d1) Booting from Hard Disk...
(d1) Booting from :7c00
(XEN) stdvga.c:151:d1v0 leaving stdvga mode
(XEN) stdvga.c:147:d1v0 entering stdvga and caching modes
(XEN) stdvga.c:520:d1v0 leaving caching mode
(XEN) [ Xen-4.6.0  x86_64  debug=y  Tainted:C ]
(XEN) CPU:3
(XEN) RIP:e008:[] vmx_cpu_up+0xacc/0xba5
(XEN) RFLAGS: 00010242   CONTEXT: hypervisor (d1v1)
(XEN) rax:    rbx: 830077877000   rcx: 834077e54000
(XEN) rdx: 834007dc8000   rsi: 2000   rdi: 830077877000
(XEN) rbp: 834007dcfc48   rsp: 834007dcfc38   r8:  0404
(XEN) r9:  000ff000   r10:    r11: fc423f1e
(XEN) r12: 2000   r13:    r14: 
(XEN) r15:    cr0: 80050033   cr4: 001526e0
(XEN) cr3: 004000763000   cr2: 
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen stack trace from rsp=834007dcfc38:
(XEN)834007dcfc98  834007dcfc68 82d0801e2533
(XEN)830077877000 2000 834007dcfc78 82d0801ea933
(XEN)834007dcfca8 82d0801eaae4  830077877000
(XEN) 834007dcff18 834007dcfd08 82d0801eb983
(XEN)8341 00013692c000 8340 fc607f28
(XEN)0008 8346 834007dcff18 830077877000
(XEN)0015  834007dcff08 82d0801e8c8d
(XEN)834007763000 8300778c2000 8340007c3000 834007dcfd50
(XEN)82d0801e120b 834007dcfd50 830077877000 834007dcfdf0
(XEN)  82d08012fe0b 834007dfcac0
(XEN)834007dd30e8 0086 834007dcfda0 82d08012d4c2
(XEN)834