Re: List of unaccessible x86 states

2009-10-26 Thread Joerg Roedel
On Mon, Oct 26, 2009 at 12:56:31PM +0200, Avi Kivity wrote:
> On 10/26/2009 12:45 PM, Joerg Roedel wrote:


> >* nested intercepts
> 
> These are part of the guest vmcb.  The host nested intercepts can be
> recalculated, no?
> 
> >* for nested nested paging: guest nested cr3 value
> 
> Part of the guest vmcb.

This will work in most cases. But it's not architecturally sane because
real hardware caches this information in the cpu. So software is free to
modify the vmcb without impacting the in-cpu state until the next
#vmexit. I don't know of any software which relies on that, so it may not
be an issue.
 
> >Off-topic question: Will the new migration protocol include some kind
> >of handshake to find out if migration is possible at all?
> >
> 
> It's assumed that migration always works for a newer qemu version,
> and that the management tools don't attempt backward migration.

I think such a handshake would make sense, if only to prevent a nested
SVM hypervisor from being migrated to an Intel machine or vice versa (just
one example; there are more, like sse*, nested nested paging, ...).

Joerg
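
A rough sketch of what such a handshake check could boil down to: the
source reports the set of features the guest depends on, and the target
refuses the migration unless it offers every one of them. The feature bits
and migration_possible() are made up for illustration; nothing like this
exists in qemu or kvm as far as this thread says.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
enum {
	MIG_FEAT_NESTED_SVM = 1u << 0,
	MIG_FEAT_NESTED_VMX = 1u << 1,
	MIG_FEAT_NESTED_NPT = 1u << 2,	/* nested nested paging */
	MIG_FEAT_SSE4       = 1u << 3,
};

/*
 * Migration is possible only if the target offers every feature the
 * guest currently uses, i.e. no used bit is missing on the target.
 */
static bool migration_possible(uint32_t guest_uses, uint32_t target_offers)
{
	return (guest_uses & ~target_offers) == 0;
}
```

A guest using nested SVM would then be rejected by a target that only
offers nested VMX, before any state is transferred.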


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: List of unaccessible x86 states

2009-10-26 Thread Avi Kivity

On 10/26/2009 12:45 PM, Joerg Roedel wrote:


> Ok, parts of the state can be saved in guest memory. But that's
> currently not done. This will need some care to not introduce a security
> hole. But it shouldn't be too difficult.
> The state that's not reproducible in a sane way is the intercept bitmap
> for the l2 guest.
> From the nested state what needs to be exposed to userspace for
> migration is:
>
> * guest mode flag (as returned by is_nested)
> * nested vmcb address

Yes, forgot that.  We can store it in the hsave area (note the hsave
area format becomes an ABI).

> * nested hsave msr

That's already saved.

> * nested intercepts

These are part of the guest vmcb.  The host nested intercepts can be
recalculated, no?

> * for nested nested paging: guest nested cr3 value

Part of the guest vmcb.

> Another state which needs exposure is the last branch record related
> state.

Aren't those just more MSRs?

> Off-topic question: Will the new migration protocol include some kind
> of handshake to find out if migration is possible at all?

It's assumed that migration always works for a newer qemu version, and
that the management tools don't attempt backward migration.


--
error compiling committee.c: too many arguments to function



Re: List of unaccessible x86 states

2009-10-26 Thread Joerg Roedel
On Mon, Oct 26, 2009 at 12:09:25PM +0200, Avi Kivity wrote:
> On 10/26/2009 11:56 AM, Joerg Roedel wrote:
> >On Mon, Oct 26, 2009 at 11:39:46AM +0200, Avi Kivity wrote:
> >>On 10/26/2009 11:30 AM, Joerg Roedel wrote:
> >>>>Which host state?  As far as I can tell, it can all be regenerated.
> >>>The state which is loaded into the vcpu when a #vmexit is emulated. This
> >>>includes segments, control registers and the host rip for example.
> >>All of this state does not change between nested guest and normal
> >>guest mode.
> >I am talking about all the state that is saved in svm->nested.hsave.
> >When we migrate a guest vcpu while it is running in guest mode itself
> >(without forcing a nested #vmexit) this state is required when a #vmexit
> >needs to be emulated on this vcpu after migration.
> >Same is true for the nested intercept conditions.
> 
> The state that is saved by VMRUN can be saved to guest memory and
> migrated.  Extra state (like the intercepts for the previous mode)
> must be saved to host memory and not migrated; host intercepts can
> be regenerated.

Ok, parts of the state can be saved in guest memory. But that's
currently not done. This will need some care to not introduce a security
hole. But it shouldn't be too difficult.
The state that's not reproducible in a sane way is the intercept bitmap
for the l2 guest.
From the nested state what needs to be exposed to userspace for
migration is:

* guest mode flag (as returned by is_nested)
* nested vmcb address
* nested hsave msr
* nested intercepts
* for nested nested paging: guest nested cr3 value

Another state which needs exposure is the last branch record related
state.
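
To make the list concrete, here is one possible shape for such a record.
This is only a sketch: struct nested_svm_state is hypothetical and not an
existing KVM structure, the intercept fields are loosely modeled on the
vmcb control area, and the explicit padding is an assumption made to keep
the layout fixed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical migration record; NOT an existing KVM structure. */
struct nested_svm_state {
	uint8_t  guest_mode;		/* as returned by is_nested */
	uint8_t  pad0[7];
	uint64_t nested_vmcb;		/* nested vmcb address */
	uint64_t hsave_msr;		/* nested hsave msr */
	/* nested intercepts */
	uint16_t intercept_cr_read;
	uint16_t intercept_cr_write;
	uint16_t intercept_dr_read;
	uint16_t intercept_dr_write;
	uint32_t intercept_exceptions;
	uint32_t pad1;
	uint64_t intercept;
	uint64_t nested_cr3;		/* only for nested nested paging */
};

/* Size of the record as it would be handed to userspace. */
static size_t nested_state_size(void)
{
	return sizeof(struct nested_svm_state);
}
```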

Off-topic question: Will the new migration protocol include some kind
of handshake to find out if migration is possible at all?

Joerg




Re: List of unaccessible x86 states

2009-10-26 Thread Avi Kivity

On 10/26/2009 11:56 AM, Joerg Roedel wrote:

> On Mon, Oct 26, 2009 at 11:39:46AM +0200, Avi Kivity wrote:
>> On 10/26/2009 11:30 AM, Joerg Roedel wrote:
>>>> Which host state?  As far as I can tell, it can all be regenerated.
>>> The state which is loaded into the vcpu when a #vmexit is emulated. This
>>> includes segments, control registers and the host rip for example.
>> All of this state does not change between nested guest and normal
>> guest mode.
> I am talking about all the state that is saved in svm->nested.hsave.
> When we migrate a guest vcpu while it is running in guest mode itself
> (without forcing a nested #vmexit) this state is required when a #vmexit
> needs to be emulated on this vcpu after migration.
> Same is true for the nested intercept conditions.
   


The state that is saved by VMRUN can be saved to guest memory and 
migrated.  Extra state (like the intercepts for the previous mode) must 
be saved to host memory and not migrated; host intercepts can be 
regenerated.


Concretely:


    hsave->save.es     = vmcb->save.es;
    hsave->save.cs     = vmcb->save.cs;
    hsave->save.ss     = vmcb->save.ss;
    hsave->save.ds     = vmcb->save.ds;
    hsave->save.gdtr   = vmcb->save.gdtr;
    hsave->save.idtr   = vmcb->save.idtr;
    hsave->save.efer   = svm->vcpu.arch.shadow_efer;
    hsave->save.cr0    = svm->vcpu.arch.cr0;
    hsave->save.cr4    = svm->vcpu.arch.cr4;
    hsave->save.rflags = vmcb->save.rflags;
    hsave->save.rip    = svm->next_rip;
    hsave->save.rsp    = vmcb->save.rsp;
    hsave->save.rax    = vmcb->save.rax;
    if (npt_enabled)
        hsave->save.cr3 = vmcb->save.cr3;
    else
        hsave->save.cr3 = svm->vcpu.arch.cr3;


Can all be saved to guest memory.

copy_vmcb_control_area(hsave, vmcb);

Must not be saved into guest memory.  On the other hand, it is not 
needed for migration.
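
The split described above could be sketched roughly as follows. The
structs are simplified stand-ins rather than the real vmcb layout, and
struct l1_snapshot and snapshot_l1() are hypothetical names: the save
area goes to a page in guest memory (and so migrates with it), while the
control area stays in host memory only.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the real vmcb layout (assumption). */
struct save_area    { uint64_t rip, rsp, rax, rflags, cr0, cr3, cr4, efer; };
struct control_area { uint64_t intercept; uint32_t intercept_exceptions; };
struct vmcb_sketch  { struct control_area control; struct save_area save; };

struct l1_snapshot {
	struct save_area   *guest_hsave;	/* in guest memory, migrated */
	struct control_area host_ctl;		/* host memory only, not migrated */
};

/* Save the l1 state on an emulated VMRUN, keeping the two parts apart. */
static void snapshot_l1(struct l1_snapshot *s, const struct vmcb_sketch *vmcb)
{
	*s->guest_hsave = vmcb->save;		/* architected state */
	s->host_ctl     = vmcb->control;	/* must stay out of guest reach */
}
```

Since the guest can only ever write to the save-area copy, scribbling
over it can hurt the l1 state but not the intercepts the host relies on.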



--
error compiling committee.c: too many arguments to function



Re: List of unaccessible x86 states

2009-10-26 Thread Joerg Roedel
On Mon, Oct 26, 2009 at 11:39:46AM +0200, Avi Kivity wrote:
> On 10/26/2009 11:30 AM, Joerg Roedel wrote:
> >
> >>Which host state?  As far as I can tell, it can all be regenerated.
> >The state which is loaded into the vcpu when a #vmexit is emulated. This
> >includes segments, control registers and the host rip for example.
> 
> All of this state does not change between nested guest and normal
> guest mode.

I am talking about all the state that is saved in svm->nested.hsave.
When we migrate a guest vcpu while it is running in guest mode itself
(without forcing a nested #vmexit) this state is required when a #vmexit
needs to be emulated on this vcpu after migration.
Same is true for the nested intercept conditions.

Joerg




Re: List of unaccessible x86 states

2009-10-26 Thread Avi Kivity

On 10/26/2009 11:30 AM, Joerg Roedel wrote:



>> Which host state?  As far as I can tell, it can all be regenerated.
>
> The state which is loaded into the vcpu when a #vmexit is emulated. This
> includes segments, control registers and the host rip for example.
   


All of this state does not change between nested guest and normal guest 
mode.


--
error compiling committee.c: too many arguments to function



Re: List of unaccessible x86 states

2009-10-26 Thread Joerg Roedel
On Mon, Oct 26, 2009 at 11:21:12AM +0200, Avi Kivity wrote:
> On 10/26/2009 11:17 AM, Joerg Roedel wrote:
> >On Sun, Oct 25, 2009 at 11:49:35AM +0200, Avi Kivity wrote:
> >>On 10/24/2009 12:35 PM, Alexander Graf wrote:
> >>>Hm, thinking about this again, it might be useful to have an
> >>>"currently in nested VM" flag here. That way userspace can decide
> >>>if it needs to get out of the nested state (for migration) or if
> >>>it just doesn't care.
> >>Getting out of nested state involves modifying state (both memory
> >>and registers).  Nor can we in the general case force it.  The guest
> >>can set up a situation where it is impossible to #vmexit.
> >There is actually more than that. If the guest runs in guest mode itself
> >we also need to report the host state to be able to do an #vmexit after
> >migration.
> >In nested SVM the host state is not saved in the guest memory to prevent
> >the guest from modifying it and break out of its virtualization jail.
> 
> Which host state?  As far as I can tell, it can all be regenerated.

The state which is loaded into the vcpu when a #vmexit is emulated. This
includes segments, control registers and the host rip for example.

Joerg




Re: List of unaccessible x86 states

2009-10-26 Thread Avi Kivity

On 10/26/2009 11:17 AM, Joerg Roedel wrote:

> On Sun, Oct 25, 2009 at 11:49:35AM +0200, Avi Kivity wrote:
>> On 10/24/2009 12:35 PM, Alexander Graf wrote:
>>> Hm, thinking about this again, it might be useful to have an
>>> "currently in nested VM" flag here. That way userspace can decide
>>> if it needs to get out of the nested state (for migration) or if
>>> it just doesn't care.
>> Getting out of nested state involves modifying state (both memory
>> and registers).  Nor can we in the general case force it.  The guest
>> can set up a situation where it is impossible to #vmexit.
> There is actually more than that. If the guest runs in guest mode itself
> we also need to report the host state to be able to do an #vmexit after
> migration.
> In nested SVM the host state is not saved in the guest memory to prevent
> the guest from modifying it and break out of its virtualization jail.
   


Which host state?  As far as I can tell, it can all be regenerated.

--
error compiling committee.c: too many arguments to function



Re: List of unaccessible x86 states

2009-10-26 Thread Avi Kivity

On 10/26/2009 11:11 AM, Alexander Graf wrote:
>> L1 hsave stores the architected state saved by vmrun, e.g. cs.sel,
>> next_rip, cr0, cr3, etc.  The host intercept bitmap is not state
>> since it is calculated from the L1 intercept bitmap and host code.
>> Indeed it can be different from host to host even with the same guest
>> state.
>
> Ah, so you'd only save off the cpu state parts of the vmcb.
>
> Currently we save off control parts too, so we can easily swap them in
> on #vmexit.

These can still be saved in a host memory area as an optimization, and
regenerated if needed.

> So if we'd migrate off when inside the nested guest, we'd have to save
> off the resume control state, OR them again with the guest vmcb
> control states and be inside the nested guest.

Right, if the new state bit (guest mode) is set, we look at the control
bits and OR them into the vmcb.  That part can be reused with the VMRUN
code.

> Wouldn't it be much easier to not migrate / save state when inside a
> nested guest? I'm afraid the code will become overly complex if we do
> allow migration while in a nested context.


I can't really see why but then I don't know the code as well as you 
do.  The current code won't work for guests which don't intercept 
external interrupts (probably only malware).  For nested vmx it may be 
necessary since vmx has a mode where interrupts are acknowledged during 
#VMEXIT and the interrupt vector is saved into a register; you can't 
fake an interrupt #VMEXIT since you can't fake the vector.  Xen is one 
guest which uses this mode.
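
The "look at the control bits and OR them into the vmcb" step from
earlier in this mail might look roughly like this. It is a sketch with
made-up names (struct vcpu_sketch, recalc_intercepts()) and a single
64-bit intercept word standing in for the full control area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified vcpu; not the kvm structure. */
struct vcpu_sketch {
	bool     guest_mode;		/* the one extra migrated state bit */
	uint64_t host_intercepts;	/* what the host needs; computed locally */
	uint64_t l1_intercepts;		/* read back from the L1 guest's vmcb */
	uint64_t active_intercepts;	/* what the real vmcb gets programmed with */
};

/*
 * Regenerate the merged intercepts after migration. Nothing host
 * specific has to be transferred: outside guest mode only the host
 * intercepts apply, in guest mode the L1 intercepts are ORed in.
 */
static void recalc_intercepts(struct vcpu_sketch *v)
{
	v->active_intercepts = v->host_intercepts;
	if (v->guest_mode)
		v->active_intercepts |= v->l1_intercepts;
}
```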


--
error compiling committee.c: too many arguments to function



Re: List of unaccessible x86 states

2009-10-26 Thread Joerg Roedel
On Sun, Oct 25, 2009 at 11:49:35AM +0200, Avi Kivity wrote:
> On 10/24/2009 12:35 PM, Alexander Graf wrote:
> >
> >Hm, thinking about this again, it might be useful to have an
> >"currently in nested VM" flag here. That way userspace can decide
> >if it needs to get out of the nested state (for migration) or if
> >it just doesn't care.
> 
> Getting out of nested state involves modifying state (both memory
> and registers).  Nor can we in the general case force it.  The guest
> can set up a situation where it is impossible to #vmexit.

There is actually more than that. If the guest runs in guest mode itself
we also need to report the host state to be able to do an #vmexit after
migration.
In nested SVM the host state is not saved in the guest memory to prevent
the guest from modifying it and break out of its virtualization jail.

Joerg




Re: List of unaccessible x86 states

2009-10-26 Thread Alexander Graf


On 26.10.2009, at 09:33, Avi Kivity wrote:

> On 10/25/2009 06:45 PM, Alexander Graf wrote:
>>>> It's not. We can't use the guest memory for hsave because then
>>>> the guest could break the l1 state, so a malicious hypervisor
>>>> could break us.
>>>
>>> Guest hsave should be used for storing guest state when switching
>>> into the nested guest, not host state.  Host state is not part of
>>> the save/restore state in any case.
>>
>> No it's not.
>>
>> When going in an l2 guest, we need to save the l1 state in the
>> hsave. Now if we'd use the l1 given hsave, the l2 guest could
>> modify the hsave.
>>
>> That means the l2 guest could rewrite the intercept bitmap to 0 and
>> compromise the host.
>
> L1 hsave stores the architected state saved by vmrun, e.g. cs.sel,
> next_rip, cr0, cr3, etc.  The host intercept bitmap is not state
> since it is calculated from the L1 intercept bitmap and host code.
> Indeed it can be different from host to host even with the same
> guest state.


Ah, so you'd only save off the cpu state parts of the vmcb.

Currently we save off control parts too, so we can easily swap them in  
on #vmexit.


So if we'd migrate off when inside the nested guest, we'd have to save  
off the resume control state, OR them again with the guest vmcb  
control states and be inside the nested guest.


Wouldn't it be much easier to not migrate / save state when inside a  
nested guest? I'm afraid the code will become overly complex if we do  
allow migration while in a nested context.


Alex





Re: List of unaccessible x86 states

2009-10-26 Thread Avi Kivity

On 10/25/2009 06:45 PM, Alexander Graf wrote:
>>> It's not. We can't use the guest memory for hsave because then the
>>> guest could break the l1 state, so a malicious hypervisor could
>>> break us.
>>
>> Guest hsave should be used for storing guest state when switching
>> into the nested guest, not host state.  Host state is not part of the
>> save/restore state in any case.
>
> No it's not.
>
> When going in an l2 guest, we need to save the l1 state in the hsave.
> Now if we'd use the l1 given hsave, the l2 guest could modify the hsave.
>
> That means the l2 guest could rewrite the intercept bitmap to 0 and
> compromise the host.

L1 hsave stores the architected state saved by vmrun, e.g. cs.sel,
next_rip, cr0, cr3, etc.  The host intercept bitmap is not state since
it is calculated from the L1 intercept bitmap and host code.  Indeed it
can be different from host to host even with the same guest state.

> That's why we're storing the hsave data in a host allocated page.
>
> Of course, we could save the whole hsave area off to the host on
> migration...


Sorry, -ENOPARSE.

--
error compiling committee.c: too many arguments to function



Re: List of unaccessible x86 states

2009-10-25 Thread Alexander Graf


On 25.10.2009, at 15:08, Avi Kivity wrote:

> On 10/25/2009 03:53 PM, Alexander Graf wrote:
>> On 25.10.2009, at 10:46, Avi Kivity wrote:
>>> On 10/20/2009 09:23 PM, Alexander Graf wrote:
>>>> If the nested hypervisor doesn't intercept INTR we don't support
>>>> it anyways.
>>>
>>> That's a bug.
>>
>> It's a question of how accurate we want to be.
>
> Even if we don't implement it immediately, it's still a bug.  It
> won't matter much until we hit a guest that needs it.
>
>>>> Really, pushing the whole nesting state over is not a good idea.
>>>
>>> Isn't the entire state just one bit?  Everything else should be
>>> saved to guest memory.
>>
>> It's not. We can't use the guest memory for hsave because then the
>> guest could break the l1 state, so a malicious hypervisor could
>> break us.
>
> Guest hsave should be used for storing guest state when switching
> into the nested guest, not host state.  Host state is not part of
> the save/restore state in any case.

No it's not.

When going in an l2 guest, we need to save the l1 state in the hsave.
Now if we'd use the l1 given hsave, the l2 guest could modify the hsave.

That means the l2 guest could rewrite the intercept bitmap to 0 and
compromise the host.

That's why we're storing the hsave data in a host allocated page.

Of course, we could save the whole hsave area off to the host on
migration...


Alex





Re: List of unaccessible x86 states

2009-10-25 Thread Avi Kivity

On 10/25/2009 03:53 PM, Alexander Graf wrote:


> On 25.10.2009, at 10:46, Avi Kivity wrote:
>> On 10/20/2009 09:23 PM, Alexander Graf wrote:
>>> If the nested hypervisor doesn't intercept INTR we don't support it
>>> anyways.
>>
>> That's a bug.
>
> It's a question of how accurate we want to be.

Even if we don't implement it immediately, it's still a bug.  It won't
matter much until we hit a guest that needs it.

>>> Really, pushing the whole nesting state over is not a good idea.
>>
>> Isn't the entire state just one bit?  Everything else should be saved
>> to guest memory.
>
> It's not. We can't use the guest memory for hsave because then the
> guest could break the l1 state, so a malicious hypervisor could break us.


Guest hsave should be used for storing guest state when switching into 
the nested guest, not host state.  Host state is not part of the 
save/restore state in any case.


--
error compiling committee.c: too many arguments to function



Re: List of unaccessible x86 states

2009-10-25 Thread Alexander Graf


On 25.10.2009, at 10:46, Avi Kivity wrote:

> On 10/20/2009 09:23 PM, Alexander Graf wrote:
>> If the nested hypervisor doesn't intercept INTR we don't support it
>> anyways.
>
> That's a bug.

It's a question of how accurate we want to be.

>> Really, pushing the whole nesting state over is not a good idea.
>
> Isn't the entire state just one bit?  Everything else should be
> saved to guest memory.


It's not. We can't use the guest memory for hsave because then the  
guest could break the l1 state, so a malicious hypervisor could break  
us.


Alex



--
error compiling committee.c: too many arguments to function




Re: List of unaccessible x86 states

2009-10-25 Thread Avi Kivity

On 10/24/2009 12:35 PM, Alexander Graf wrote:


> Hm, thinking about this again, it might be useful to have an
> "currently in nested VM" flag here. That way userspace can decide if
> it needs to get out of the nested state (for migration) or if it just
> doesn't care.


Getting out of nested state involves modifying state (both memory and 
registers).  Nor can we in the general case force it.  The guest can set 
up a situation where it is impossible to #vmexit.



>> - KVM_X86_VCPU_STATE_SVM
>>    o gif
>
> Can we make this an "svm_flags" or so u32? And then we'd just set bits?



Or individual flags as u8s, so we don't get trapped into a specific 
encoding which is really an implementation detail.
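
For comparison, the two proposals side by side, as a sketch only. None of
the names below (SVM_FLAG_*, struct svm_state_bits, struct
svm_state_bytes, svm_bits_to_bytes()) are existing KVM definitions; they
are invented here to show the difference.

```c
#include <assert.h>
#include <stdint.h>

/* Variant 1: a single u32; the bit positions become part of the ABI. */
#define SVM_FLAG_GIF		(1u << 0)
#define SVM_FLAG_GUEST_MODE	(1u << 1)

struct svm_state_bits {
	uint32_t svm_flags;
};

/* Variant 2: one u8 per flag; there is no bit encoding to agree on. */
struct svm_state_bytes {
	uint8_t gif;
	uint8_t guest_mode;
	uint8_t pad[2];
};

/* Both variants carry the same information. */
static struct svm_state_bytes svm_bits_to_bytes(struct svm_state_bits in)
{
	struct svm_state_bytes out = {
		.gif        = !!(in.svm_flags & SVM_FLAG_GIF),
		.guest_mode = !!(in.svm_flags & SVM_FLAG_GUEST_MODE),
	};
	return out;
}
```

Both structs are four bytes here, so the choice is about what gets frozen
into the interface rather than about size.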


--
error compiling committee.c: too many arguments to function



Re: List of unaccessible x86 states

2009-10-25 Thread Avi Kivity

On 10/20/2009 09:23 PM, Alexander Graf wrote:


> If the nested hypervisor doesn't intercept INTR we don't support it
> anyways.


That's a bug.


> Really, pushing the whole nesting state over is not a good idea.


Isn't the entire state just one bit?  Everything else should be saved to 
guest memory.


--
error compiling committee.c: too many arguments to function



Re: List of unaccessible x86 states

2009-10-24 Thread Alexander Graf


On 23.10.2009, at 21:34, Jan Kiszka wrote:


> Jan Kiszka wrote:
>> Hi all,
>>
>> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
>> this is an attempt to collect the precise requirements for additional
>> state fields. Once everyone feels the list is complete, we can decide
>> how to partition it into one or more substates for the new
>> KVM_GET/SET_VCPU_STATE interface.
>>
>> What I read so far (or tried to patch already):
>>
>> - nmi_masked
>> - nmi_pending
>> - nmi_injected
>> - kvm_queued_exception (whole struct content)
>> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
>>
>> Unclear points (for me) from the last discussion:
>>
>> - sipi_vector
>> - MCE (covered via kvm_queued_exception, or does it require more?)
>>
>> Please extend or correct the list as required.
>
> Here is a wrap-up of what has been reported so far:
>
> - NMI
>    o nmi_masked
>    o nmi_pending
>    o nmi_injected
> - queued exception
>    o kvm_queued_exception
>    o triple_fault
> - SVM
>    o gif
>    (Are we sure that there is really nothing more here?)


Hm, thinking about this again, it might be useful to have an  
"currently in nested VM" flag here. That way userspace can decide if  
it needs to get out of the nested state (for migration) or if it just  
doesn't care.



> - sipi_vector
>
> So the next question is how to map these on substates. I'm currently
> leaning towards this organization:
>
> - KVM_X86_VCPU_STATE_EVENTS
>    o NMI states
>    o pending exception
>    o sipi_vector
>    o pending interrupt?
>      (would be redundant to kvm_sregs.interrupt_bitmap, but that struct
>      may be obsoleted one day)
> - KVM_X86_VCPU_STATE_SVM
>    o gif


Can we make this an "svm_flags" or so u32? And then we'd just set bits?

Alex



Re: List of unaccessible x86 states

2009-10-23 Thread Jan Kiszka
Jan Kiszka wrote:
> Hi all,
> 
> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
> this is an attempt to collect the precise requirements for additional
> state fields. Once everyone feels the list is complete, we can decide
> how to partition it into one or more substates for the new
> KVM_GET/SET_VCPU_STATE interface.
> 
> What I read so far (or tried to patch already):
> 
> - nmi_masked
> - nmi_pending
> - nmi_injected
> - kvm_queued_exception (whole struct content)
> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
> 
> Unclear points (for me) from the last discussion:
> 
> - sipi_vector
> - MCE (covered via kvm_queued_exception, or does it require more?)
> 
> Please extend or correct the list as required.
> 

Here is a wrap-up of what has been reported so far:

 - NMI
    o nmi_masked
    o nmi_pending
    o nmi_injected
 - queued exception
    o kvm_queued_exception
    o triple_fault
 - SVM
    o gif
    (Are we sure that there is really nothing more here?)
 - sipi_vector

So the next question is how to map these on substates. I'm currently
leaning towards this organization:

 - KVM_X86_VCPU_STATE_EVENTS
    o NMI states
    o pending exception
    o sipi_vector
    o pending interrupt?
      (would be redundant to kvm_sregs.interrupt_bitmap, but that struct
      may be obsoleted one day)
 - KVM_X86_VCPU_STATE_SVM
    o gif

Any concerns or better suggestions?

Jan
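
As a rough illustration of the organization proposed above, the two
substates could carry payloads along these lines. The struct names and
exact field sets are assumptions for discussion, not a proposed ABI; the
pending interrupt is left out since it is still marked as an open
question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payload for KVM_X86_VCPU_STATE_EVENTS. */
struct vcpu_state_events {
	/* NMI states */
	uint8_t  nmi_masked;
	uint8_t  nmi_pending;
	uint8_t  nmi_injected;
	/* pending exception, including a requested triple fault */
	uint8_t  exception_pending;
	uint8_t  exception_nr;
	uint8_t  exception_has_error_code;
	uint8_t  triple_fault;
	uint8_t  sipi_vector;
	uint32_t exception_error_code;
};

/* Hypothetical payload for KVM_X86_VCPU_STATE_SVM. */
struct vcpu_state_svm {
	uint8_t gif;
};

/* Size userspace would have to allocate for the events substate. */
static size_t events_substate_size(void)
{
	return sizeof(struct vcpu_state_events);
}
```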





Re: List of unaccessible x86 states

2009-10-23 Thread Jan Kiszka
Marcelo Tosatti wrote:
> On Fri, Oct 23, 2009 at 03:08:21PM +0200, Jan Kiszka wrote:
>> Marcelo Tosatti wrote:
>>> On Tue, Oct 20, 2009 at 03:01:15PM +0200, Jan Kiszka wrote:
>>>> Hi all,
>>>>
>>>> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
>>>> this is an attempt to collect the precise requirements for additional
>>>> state fields. Once everyone feels the list is complete, we can decide
>>>> how to partition it into one or more substates for the new
>>>> KVM_GET/SET_VCPU_STATE interface.
>>>>
>>>> What I read so far (or tried to patch already):
>>>>
>>>> - nmi_masked
>>>> - nmi_pending
>>>> - nmi_injected
>>>> - kvm_queued_exception (whole struct content)
>>>> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
>>>>
>>>> Unclear points (for me) from the last discussion:
>>>>
>>>> - sipi_vector
>>>> - MCE (covered via kvm_queued_exception, or does it require more?)
>>> Should save/restore the MCE MSRs (its contents are currently
>>> lost/overwritten AFAICS).
>>>
>>> MTRR contents are also dropped.
>> Hmm, the code path is winding, but aren't they already available to user
>> space via GET/SET_MSRS?
> 
> Yes, nevermind, irrelevant to the current discussion.
> 

Oh, then I misunderstood your original reply as "we need to add them to
the list as well". Even better.

Jan





Re: List of unaccessible x86 states

2009-10-23 Thread Marcelo Tosatti
On Fri, Oct 23, 2009 at 03:08:21PM +0200, Jan Kiszka wrote:
> Marcelo Tosatti wrote:
> > On Tue, Oct 20, 2009 at 03:01:15PM +0200, Jan Kiszka wrote:
> >> Hi all,
> >>
> >> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
> >> this is an attempt to collect the precise requirements for additional
> >> state fields. Once everyone feels the list is complete, we can decide
> >> how to partition it into one or more substates for the new
> >> KVM_GET/SET_VCPU_STATE interface.
> >>
> >> What I read so far (or tried to patch already):
> >>
> >> - nmi_masked
> >> - nmi_pending
> >> - nmi_injected
> >> - kvm_queued_exception (whole struct content)
> >> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
> >>
> >> Unclear points (for me) from the last discussion:
> >>
> >> - sipi_vector
> >> - MCE (covered via kvm_queued_exception, or does it require more?)
> > 
> > Should save/restore the MCE MSRs (its contents are currently
> > lost/overwritten AFAICS).
> > 
> > MTRR contents are also dropped.
> 
> Hmm, the code path is winding, but aren't they already available to user
> space via GET/SET_MSRS?

Yes, nevermind, irrelevant to the current discussion.



Re: List of unaccessible x86 states

2009-10-23 Thread Jan Kiszka
Marcelo Tosatti wrote:
> On Tue, Oct 20, 2009 at 03:01:15PM +0200, Jan Kiszka wrote:
>> Hi all,
>>
>> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
>> this is an attempt to collect the precise requirements for additional
>> state fields. Once everyone feels the list is complete, we can decide
>> how to partition it into one or more substates for the new
>> KVM_GET/SET_VCPU_STATE interface.
>>
>> What I read so far (or tried to patch already):
>>
>> - nmi_masked
>> - nmi_pending
>> - nmi_injected
>> - kvm_queued_exception (whole struct content)
>> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
>>
>> Unclear points (for me) from the last discussion:
>>
>> - sipi_vector
>> - MCE (covered via kvm_queued_exception, or does it require more?)
> 
> Should save/restore the MCE MSRs (its contents are currently
> lost/overwritten AFAICS).
> 
> MTRR contents are also dropped.

Hmm, the code path is winding, but aren't they already available to user
space via GET/SET_MSRS?

Jan


Re: List of unaccessible x86 states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 09:23:22PM +0200, Alexander Graf wrote:
> 
> On 20.10.2009, at 21:09, Gleb Natapov wrote:
> 
> >On Tue, Oct 20, 2009 at 08:59:48PM +0200, Alexander Graf wrote:
> >>
> >>On 20.10.2009, at 20:55, Gleb Natapov wrote:
> >>
> >>>On Tue, Oct 20, 2009 at 03:51:02PM +0200, Alexander Graf wrote:
> 
> On 20.10.2009, at 15:48, Gleb Natapov wrote:
> 
> >On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:
> >>
> >>On 20.10.2009, at 15:37, Jan Kiszka wrote:
> >>
> >>>Alexander Graf wrote:
> On 20.10.2009, at 15:01, Jan Kiszka wrote:
> 
> >Hi all,
> >
> >as the list of yet user-unaccessible x86 states is a bit
> >volatile ATM,
> >this is an attempt to collect the precise requirements for
> >additional
> >state fields. Once everyone feels the list is complete, we can
> >decide
> >how to partition it into one or more substates for the new
> >KVM_GET/SET_VCPU_STATE interface.
> >
> >What I read so far (or tried to patch already):
> >
> >- nmi_masked
> >- nmi_pending
> >- nmi_injected
> >- kvm_queued_exception (whole struct content)
> >- KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
> >
> >Unclear points (for me) from the last discussion:
> >
> >- sipi_vector
> >- MCE (covered via kvm_queued_exception, or does it
> >require more?)
> >
> >Please extend or correct the list as required.
> 
> hflags. Qemu supports GIF, kvm supports GIF, but no side
> knows how to
> sync it.
> >>>
> >>>BTW, GIF is related to svm nesting, right?
> >>
> >>Yes and no. It's an architecture addition that came with
> >>SVM, yes.
> >>
> >>The problem is that I don't want to support migrating while in a
> >Why not?
> 
> Because then we'd have to transfer the whole host cpu cache and the
> merged intercept bitmaps to userspace as well. That's just too many
> internals to expose IMHO.
> 
> >>>But the amount of information is constant no matter how l2
> >>>guest there
> >>>are. Correct? We can expose it as separate substate.
> >>
> >>Or we can just not migrate while in a nested guest :-). Which will
> >>make everything a lot easier.
> >>
> >Suppose we have a l2 guest that handles interrupt/nmis by itself
> >how can we
> >force it to exit?
> 
> If the nested hypervisor doesn't intercept INTR we don't support it
> anyways.
> 
Why? I looked at the code briefly and it looks like we just inject the
interrupt as usual instead of doing a nested exit if l2 does not intercept
INTR. Have I misinterpreted the code? Even if I have, why not support
it?
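
The two behaviours being contrasted here can be written down as a small
decision function. This is only a hypothetical model and not the actual
KVM code: struct vcpu_model, deliver_intr() and the enum values are
invented for illustration. Only the rule itself (nested #VMEXIT when the
nested hypervisor intercepts INTR, plain injection otherwise) is taken
from the discussion.

```c
#include <stdbool.h>

/* Hypothetical model of the vcpu state relevant to this decision. */
struct vcpu_model {
    bool in_nested_guest;    /* currently running an l2 guest */
    bool l1_intercepts_intr; /* nested hypervisor set the INTR intercept */
};

enum intr_action {
    INTR_INJECT_DIRECTLY,    /* inject into whatever context is running */
    INTR_NESTED_VMEXIT       /* leave l2 and report VMEXIT_INTR to l1 */
};

/* Decide how an external interrupt is delivered in this model. */
static enum intr_action deliver_intr(const struct vcpu_model *v)
{
    if (v->in_nested_guest && v->l1_intercepts_intr)
        return INTR_NESTED_VMEXIT;
    return INTR_INJECT_DIRECTLY;
}
```

In this model the "not supported" case is the one where in_nested_guest
is set and l1_intercepts_intr is clear: nothing forces l2 to exit, so
the interrupt ends up in the nested guest.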

> >I don't think requesting certain cpu state before
> >migration is the right thing to do. What if user paused a VM and then
> >decided to migrate?
> 
> So pausing has to make it go out of nested guest context too?
Probably.

> Then we're not in the nested guest context, right? :)
> 
> >Or VM was paused automatically because of shortage
> >of disk space and management want to migrate VM to other host with
> >bigger disk?
> 
> Same as before.
What do you mean?

> 
> 
> Really, pushing the whole nesting state over is not a good idea.
> 
Maybe just disallow migration while a nested guest is running then?
Cross-vendor migration is not possible anyway.

--
Gleb.


Re: List of unaccessible x86 states

2009-10-20 Thread Alexander Graf


On 20.10.2009, at 21:09, Gleb Natapov wrote:


On Tue, Oct 20, 2009 at 08:59:48PM +0200, Alexander Graf wrote:


On 20.10.2009, at 20:55, Gleb Natapov wrote:


On Tue, Oct 20, 2009 at 03:51:02PM +0200, Alexander Graf wrote:


On 20.10.2009, at 15:48, Gleb Natapov wrote:


On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:


On 20.10.2009, at 15:37, Jan Kiszka wrote:


Alexander Graf wrote:

On 20.10.2009, at 15:01, Jan Kiszka wrote:


Hi all,

as the list of yet user-unaccessible x86 states is a bit
volatile ATM,
this is an attempt to collect the precise requirements for
additional
state fields. Once everyone feels the list is complete, we can
decide
how to partition it into one ore more substates for the new
KVM_GET/SET_VCPU_STATE interface.

What I read so far (or tried to patch already):

- nmi_masked
- nmi_pending
- nmi_injected
- kvm_queued_exception (whole struct content)
- KVM_REQ_TRIPLE_FAULT (from vcpu.requests)

Unclear points (for me) from the last discussion:

- sipi_vector
- MCE (covered via kvm_queued_exception, or does it
require more?)

Please extend or correct the list as required.


hflags. Qemu supports GIF, kvm supports GIF, but no side
knows how to
sync it.


BTW, GIF is related to svm nesting, right?


Yes and no. It's an architecture addition that came with SVM,  
yes.


The problem is that I don't want to support migrating while in a

Why not?


Because then we'd have to transfer the whole host cpu cache and the
merged intercept bitmaps to userspace as well. That's just too many
internals to expose IMHO.

But the amount of information is constant no matter how l2 guest  
there

are. Correct? We can expose it as separate substate.


Or we can just not migrate while in a nested guest :-). Which will
make everything a lot easier.

Suppose we have a l2 guest that handles interrupt/nmis by itself how  
can we

force it to exit?


If the nested hypervisor doesn't intercept INTR we don't support it  
anyways.



I don't think requesting certain cpu state before
migration is the right thing to do. What if user paused a VM and then
decided to migrate?


So pausing has to make it go out of nested guest context too?
Then we're not in the nested guest context, right? :)


Or VM was paused automatically because of shortage
of disk space and management want to migrate VM to other host with
bigger disk?


Same as before.


Really, pushing the whole nesting state over is not a good idea.

Alex


Re: List of unaccessible x86 states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 08:59:48PM +0200, Alexander Graf wrote:
> 
> On 20.10.2009, at 20:55, Gleb Natapov wrote:
> 
> >On Tue, Oct 20, 2009 at 03:51:02PM +0200, Alexander Graf wrote:
> >>
> >>On 20.10.2009, at 15:48, Gleb Natapov wrote:
> >>
> >>>On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:
> 
> On 20.10.2009, at 15:37, Jan Kiszka wrote:
> 
> >Alexander Graf wrote:
> >>On 20.10.2009, at 15:01, Jan Kiszka wrote:
> >>
> >>>Hi all,
> >>>
> >>>as the list of yet user-unaccessible x86 states is a bit
> >>>volatile ATM,
> >>>this is an attempt to collect the precise requirements for
> >>>additional
> >>>state fields. Once everyone feels the list is complete, we can
> >>>decide
> >>>how to partition it into one ore more substates for the new
> >>>KVM_GET/SET_VCPU_STATE interface.
> >>>
> >>>What I read so far (or tried to patch already):
> >>>
> >>>- nmi_masked
> >>>- nmi_pending
> >>>- nmi_injected
> >>>- kvm_queued_exception (whole struct content)
> >>>- KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
> >>>
> >>>Unclear points (for me) from the last discussion:
> >>>
> >>>- sipi_vector
> >>>- MCE (covered via kvm_queued_exception, or does it
> >>>require more?)
> >>>
> >>>Please extend or correct the list as required.
> >>
> >>hflags. Qemu supports GIF, kvm supports GIF, but no side
> >>knows how to
> >>sync it.
> >
> >BTW, GIF is related to svm nesting, right?
> 
> Yes and no. It's an architecture addition that came with SVM, yes.
> 
> The problem is that I don't want to support migrating while in a
> >>>Why not?
> >>
> >>Because then we'd have to transfer the whole host cpu cache and the
> >>merged intercept bitmaps to userspace as well. That's just too many
> >>internals to expose IMHO.
> >>
> >But the amount of information is constant no matter how l2 guest there
> >are. Correct? We can expose it as separate substate.
> 
> Or we can just not migrate while in a nested guest :-). Which will
> make everything a lot easier.
> 
Suppose we have an l2 guest that handles interrupts/NMIs by itself; how
can we force it to exit? I don't think requesting a certain cpu state
before migration is the right thing to do. What if the user paused a VM
and then decided to migrate? Or the VM was paused automatically because
of a shortage of disk space and management wants to migrate the VM to
another host with a bigger disk?

--
Gleb.


Re: List of unaccessible x86 states

2009-10-20 Thread Alexander Graf


On 20.10.2009, at 20:55, Gleb Natapov wrote:


On Tue, Oct 20, 2009 at 03:51:02PM +0200, Alexander Graf wrote:


On 20.10.2009, at 15:48, Gleb Natapov wrote:


On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:


On 20.10.2009, at 15:37, Jan Kiszka wrote:


Alexander Graf wrote:

On 20.10.2009, at 15:01, Jan Kiszka wrote:


Hi all,

as the list of yet user-unaccessible x86 states is a bit
volatile ATM,
this is an attempt to collect the precise requirements for
additional
state fields. Once everyone feels the list is complete, we can
decide
how to partition it into one ore more substates for the new
KVM_GET/SET_VCPU_STATE interface.

What I read so far (or tried to patch already):

- nmi_masked
- nmi_pending
- nmi_injected
- kvm_queued_exception (whole struct content)
- KVM_REQ_TRIPLE_FAULT (from vcpu.requests)

Unclear points (for me) from the last discussion:

- sipi_vector
- MCE (covered via kvm_queued_exception, or does it require  
more?)


Please extend or correct the list as required.


hflags. Qemu supports GIF, kvm supports GIF, but no side
knows how to
sync it.


BTW, GIF is related to svm nesting, right?


Yes and no. It's an architecture addition that came with SVM, yes.

The problem is that I don't want to support migrating while in a

Why not?


Because then we'd have to transfer the whole host cpu cache and the
merged intercept bitmaps to userspace as well. That's just too many
internals to expose IMHO.


But the amount of information is constant no matter how l2 guest there
are. Correct? We can expose it as separate substate.


Or we can just not migrate while in a nested guest :-). Which will  
make everything a lot easier.


Alex


Re: List of unaccessible x86 states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 03:51:02PM +0200, Alexander Graf wrote:
> 
> On 20.10.2009, at 15:48, Gleb Natapov wrote:
> 
> >On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:
> >>
> >>On 20.10.2009, at 15:37, Jan Kiszka wrote:
> >>
> >>>Alexander Graf wrote:
> On 20.10.2009, at 15:01, Jan Kiszka wrote:
> 
> >Hi all,
> >
> >as the list of yet user-unaccessible x86 states is a bit
> >volatile ATM,
> >this is an attempt to collect the precise requirements for
> >additional
> >state fields. Once everyone feels the list is complete, we can
> >decide
> >how to partition it into one ore more substates for the new
> >KVM_GET/SET_VCPU_STATE interface.
> >
> >What I read so far (or tried to patch already):
> >
> >- nmi_masked
> >- nmi_pending
> >- nmi_injected
> >- kvm_queued_exception (whole struct content)
> >- KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
> >
> >Unclear points (for me) from the last discussion:
> >
> >- sipi_vector
> >- MCE (covered via kvm_queued_exception, or does it require more?)
> >
> >Please extend or correct the list as required.
> 
> hflags. Qemu supports GIF, kvm supports GIF, but no side
> knows how to
> sync it.
> >>>
> >>>BTW, GIF is related to svm nesting, right?
> >>
> >>Yes and no. It's an architecture addition that came with SVM, yes.
> >>
> >>The problem is that I don't want to support migrating while in a
> >Why not?
> 
> Because then we'd have to transfer the whole host cpu cache and the
> merged intercept bitmaps to userspace as well. That's just too many
> internals to expose IMHO.
> 
But the amount of information is constant no matter how many l2 guests
there are. Correct? We can expose it as a separate substate.

> >>nested VM. We can just #VMEXIT just before migrating with a
> >>VMEXIT_INTR intercept.
> >>
> >We don't notify kernel about migration currently. CPU state is
> >migrated
> >when VM is already paused, how we can exit nested guest at this point?
> 
> Hm - introduce a new ioctl? I haven't fully thought it through yet :-).
> 
There is no software problem that can't be solved by introducing a new
ioctl :)

--
Gleb.


Re: List of unaccessible x86 states

2009-10-20 Thread Marcelo Tosatti
On Tue, Oct 20, 2009 at 03:01:15PM +0200, Jan Kiszka wrote:
> Hi all,
> 
> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
> this is an attempt to collect the precise requirements for additional
> state fields. Once everyone feels the list is complete, we can decide
> how to partition it into one ore more substates for the new
> KVM_GET/SET_VCPU_STATE interface.
> 
> What I read so far (or tried to patch already):
> 
> - nmi_masked
> - nmi_pending
> - nmi_injected
> - kvm_queued_exception (whole struct content)
> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
> 
> Unclear points (for me) from the last discussion:
> 
> - sipi_vector
> - MCE (covered via kvm_queued_exception, or does it require more?)

We should save/restore the MCE MSRs (their contents are currently
lost/overwritten AFAICS).

MTRR contents are also dropped.
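
For reference, a minimal sketch of which MSR indices the MCE state
would cover, assuming it is transferred as a plain list of MSRs. The MSR
numbers are the architectural ones; mce_msr_indices() is a hypothetical
helper, and whether MCG_CTL and the per-bank registers actually exist on
a given vcpu depends on MCG_CAP, which this sketch does not check.

```c
#include <stddef.h>
#include <stdint.h>

/* Architectural machine-check MSR numbers. */
#define MSR_IA32_MCG_CAP    0x179
#define MSR_IA32_MCG_STATUS 0x17a
#define MSR_IA32_MCG_CTL    0x17b
#define MSR_IA32_MC0_CTL    0x400   /* bank i starts at 0x400 + 4 * i */

/*
 * Fill 'out' with the MSR indices covering the global MCE registers and
 * 'banks' machine-check banks (CTL, STATUS, ADDR, MISC per bank).
 * Returns the number of indices needed; writes at most 'cap' of them.
 */
static size_t mce_msr_indices(unsigned banks, uint32_t *out, size_t cap)
{
    static const uint32_t global[] = {
        MSR_IA32_MCG_CAP, MSR_IA32_MCG_STATUS, MSR_IA32_MCG_CTL
    };
    size_t n = 0;

    for (size_t i = 0; i < sizeof(global) / sizeof(global[0]); i++, n++)
        if (n < cap)
            out[n] = global[i];

    for (unsigned b = 0; b < banks; b++)
        for (unsigned r = 0; r < 4; r++, n++)
            if (n < cap)
                out[n] = MSR_IA32_MC0_CTL + 4 * b + r;

    return n;
}
```

Calling it with cap == 0 just returns the count, so a caller can size
its buffer first and fill it in a second call.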



Re: List of unaccessible x86 states

2009-10-20 Thread Alexander Graf


On 20.10.2009, at 15:48, Gleb Natapov wrote:


On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:


On 20.10.2009, at 15:37, Jan Kiszka wrote:


Alexander Graf wrote:

On 20.10.2009, at 15:01, Jan Kiszka wrote:


Hi all,

as the list of yet user-unaccessible x86 states is a bit
volatile ATM,
this is an attempt to collect the precise requirements for
additional
state fields. Once everyone feels the list is complete, we can
decide
how to partition it into one ore more substates for the new
KVM_GET/SET_VCPU_STATE interface.

What I read so far (or tried to patch already):

- nmi_masked
- nmi_pending
- nmi_injected
- kvm_queued_exception (whole struct content)
- KVM_REQ_TRIPLE_FAULT (from vcpu.requests)

Unclear points (for me) from the last discussion:

- sipi_vector
- MCE (covered via kvm_queued_exception, or does it require more?)

Please extend or correct the list as required.


hflags. Qemu supports GIF, kvm supports GIF, but no side knows  
how to

sync it.


BTW, GIF is related to svm nesting, right?


Yes and no. It's an architecture addition that came with SVM, yes.

The problem is that I don't want to support migrating while in a

Why not?


Because then we'd have to transfer the whole host cpu cache and the  
merged intercept bitmaps to userspace as well. That's just too many  
internals to expose IMHO.



nested VM. We can just #VMEXIT just before migrating with a
VMEXIT_INTR intercept.

We don't notify kernel about migration currently. CPU state is  
migrated

when VM is already paused, how we can exit nested guest at this point?


Hm - introduce a new ioctl? I haven't fully thought it through yet :-).

Alex



Re: List of unaccessible x86 states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:
> 
> On 20.10.2009, at 15:37, Jan Kiszka wrote:
> 
> >Alexander Graf wrote:
> >>On 20.10.2009, at 15:01, Jan Kiszka wrote:
> >>
> >>>Hi all,
> >>>
> >>>as the list of yet user-unaccessible x86 states is a bit
> >>>volatile ATM,
> >>>this is an attempt to collect the precise requirements for
> >>>additional
> >>>state fields. Once everyone feels the list is complete, we can
> >>>decide
> >>>how to partition it into one ore more substates for the new
> >>>KVM_GET/SET_VCPU_STATE interface.
> >>>
> >>>What I read so far (or tried to patch already):
> >>>
> >>>- nmi_masked
> >>>- nmi_pending
> >>>- nmi_injected
> >>>- kvm_queued_exception (whole struct content)
> >>>- KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
> >>>
> >>>Unclear points (for me) from the last discussion:
> >>>
> >>>- sipi_vector
> >>>- MCE (covered via kvm_queued_exception, or does it require more?)
> >>>
> >>>Please extend or correct the list as required.
> >>
> >>hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to
> >>sync it.
> >
> >BTW, GIF is related to svm nesting, right?
> 
> Yes and no. It's an architecture addition that came with SVM, yes.
> 
> The problem is that I don't want to support migrating while in a
Why not?

> nested VM. We can just #VMEXIT just before migrating with a
> VMEXIT_INTR intercept.
> 
We don't notify the kernel about migration currently. CPU state is
migrated when the VM is already paused; how can we exit the nested guest
at this point?

> Now just after #VMEXIT we're in a state that's pure host context,
> but has GIF=0. So we need to know about that in userspace to support
> migration.
> 
> Alex

--
Gleb.


Re: List of unaccessible x86 states

2009-10-20 Thread Alexander Graf


On 20.10.2009, at 15:37, Jan Kiszka wrote:


Alexander Graf wrote:

On 20.10.2009, at 15:01, Jan Kiszka wrote:


Hi all,

as the list of yet user-unaccessible x86 states is a bit volatile  
ATM,
this is an attempt to collect the precise requirements for  
additional
state fields. Once everyone feels the list is complete, we can  
decide

how to partition it into one ore more substates for the new
KVM_GET/SET_VCPU_STATE interface.

What I read so far (or tried to patch already):

- nmi_masked
- nmi_pending
- nmi_injected
- kvm_queued_exception (whole struct content)
- KVM_REQ_TRIPLE_FAULT (from vcpu.requests)

Unclear points (for me) from the last discussion:

- sipi_vector
- MCE (covered via kvm_queued_exception, or does it require more?)

Please extend or correct the list as required.


hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to
sync it.


BTW, GIF is related to svm nesting, right?


Yes and no. It's an architecture addition that came with SVM, yes.

The problem is that I don't want to support migrating while in a  
nested VM. We can just #VMEXIT just before migrating with a  
VMEXIT_INTR intercept.


Now just after #VMEXIT we're in a state that's pure host context, but  
has GIF=0. So we need to know about that in userspace to support  
migration.


Alex


Re: List of unaccessible x86 states

2009-10-20 Thread Jan Kiszka
Alexander Graf wrote:
> On 20.10.2009, at 15:01, Jan Kiszka wrote:
> 
>> Hi all,
>>
>> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
>> this is an attempt to collect the precise requirements for additional
>> state fields. Once everyone feels the list is complete, we can decide
>> how to partition it into one ore more substates for the new
>> KVM_GET/SET_VCPU_STATE interface.
>>
>> What I read so far (or tried to patch already):
>>
>> - nmi_masked
>> - nmi_pending
>> - nmi_injected
>> - kvm_queued_exception (whole struct content)
>> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
>>
>> Unclear points (for me) from the last discussion:
>>
>> - sipi_vector
>> - MCE (covered via kvm_queued_exception, or does it require more?)
>>
>> Please extend or correct the list as required.
> 
> hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to  
> sync it.

BTW, GIF is related to svm nesting, right?

Orit, are there any additional states arriving on the vmx side as well
with your nesting patches?

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


Re: List of unaccessible x86 states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 03:01:15PM +0200, Jan Kiszka wrote:
> Hi all,
> 
> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
> this is an attempt to collect the precise requirements for additional
> state fields. Once everyone feels the list is complete, we can decide
> how to partition it into one ore more substates for the new
> KVM_GET/SET_VCPU_STATE interface.
> 
> What I read so far (or tried to patch already):
> 
> - nmi_masked
> - nmi_pending
> - nmi_injected
> - kvm_queued_exception (whole struct content)
> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
> 
> Unclear points (for me) from the last discussion:
> 
> - sipi_vector
Should be migrated.

> - MCE (covered via kvm_queued_exception, or does it require more?)
> 
> Please extend or correct the list as required.
> 
> Jan
> 
> -- 
> Siemens AG, Corporate Technology, CT SE 2
> Corporate Competence Center Embedded Linux

--
Gleb.


Re: List of unaccessible x86 states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 03:29:38PM +0200, Jan Kiszka wrote:
> Gleb Natapov wrote:
> > On Tue, Oct 20, 2009 at 03:19:41PM +0200, Jan Kiszka wrote:
> >> Alexander Graf wrote:
> >>> On 20.10.2009, at 15:01, Jan Kiszka wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
> >>>> this is an attempt to collect the precise requirements for additional
> >>>> state fields. Once everyone feels the list is complete, we can decide
> >>>> how to partition it into one ore more substates for the new
> >>>> KVM_GET/SET_VCPU_STATE interface.
> >>>>
> >>>> What I read so far (or tried to patch already):
> >>>>
> >>>> - nmi_masked
> >>>> - nmi_pending
> >>>> - nmi_injected
> >>>> - kvm_queued_exception (whole struct content)
> >>>> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
> >>>>
> >>>> Unclear points (for me) from the last discussion:
> >>>>
> >>>> - sipi_vector
> >>>> - MCE (covered via kvm_queued_exception, or does it require more?)
> >>>>
> >>>> Please extend or correct the list as required.
> >>> hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to  
> >>> sync it.
> >> OK. Whole hflags or just the GIF bit?
> >>
> >> If we allow access to all bits, can user space cause any problems
> >> (beyond screwing up its guests) by passing weird patterns?
> >>
> > HF_NMI_MASK should be migrated too. Destination should enable IRET 
> > intercept if
> > HF_NMI_MASK is set. Or we can assume that migration in the middle of NMI
> > will never happen :)
> 
> HF_NMI_MASK is redundant to the vendor-agnostic nmi_masked and would
> therefore likely be masked out.
> 
Correct. We can restore HF_NMI_MASK from nmi_masked.
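
A minimal sketch of that restore step on the destination, assuming
nmi_masked arrives through the new state interface. struct svm_model and
svm_restore_nmi_mask() are hypothetical stand-ins, not existing KVM
symbols; the HF_NMI_MASK value is the one from kvm_host.h quoted in this
thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define HF_NMI_MASK (1u << 3)   /* value as listed in kvm_host.h */

/* Hypothetical stand-in for the SVM-specific vcpu state. */
struct svm_model {
    uint32_t hflags;
    bool iret_intercepted;
};

/* Rebuild the SVM-internal NMI masking state from the generic flag. */
static void svm_restore_nmi_mask(struct svm_model *svm, bool nmi_masked)
{
    if (nmi_masked) {
        svm->hflags |= HF_NMI_MASK;
        svm->iret_intercepted = true;   /* IRET ends the NMI window */
    } else {
        svm->hflags &= ~HF_NMI_MASK;
        svm->iret_intercepted = false;
    }
}
```

The other hflags bits are left untouched, so the vendor-specific bit
never has to appear in the migrated state.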

--
Gleb.


Re: List of unaccessible x86 states

2009-10-20 Thread Jan Kiszka
Gleb Natapov wrote:
> On Tue, Oct 20, 2009 at 03:19:41PM +0200, Jan Kiszka wrote:
>> Alexander Graf wrote:
>>> On 20.10.2009, at 15:01, Jan Kiszka wrote:
>>>
>>>> Hi all,
>>>>
>>>> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
>>>> this is an attempt to collect the precise requirements for additional
>>>> state fields. Once everyone feels the list is complete, we can decide
>>>> how to partition it into one ore more substates for the new
>>>> KVM_GET/SET_VCPU_STATE interface.
>>>>
>>>> What I read so far (or tried to patch already):
>>>>
>>>> - nmi_masked
>>>> - nmi_pending
>>>> - nmi_injected
>>>> - kvm_queued_exception (whole struct content)
>>>> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
>>>>
>>>> Unclear points (for me) from the last discussion:
>>>>
>>>> - sipi_vector
>>>> - MCE (covered via kvm_queued_exception, or does it require more?)
>>>>
>>>> Please extend or correct the list as required.
>>> hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to  
>>> sync it.
>> OK. Whole hflags or just the GIF bit?
>>
>> If we allow access to all bits, can user space cause any problems
>> (beyond screwing up its guests) by passing weird patterns?
>>
> HF_NMI_MASK should be migrated too. Destination should enable IRET intercept 
> if
> HF_NMI_MASK is set. Or we can assume that migration in the middle of NMI
> will never happen :)

HF_NMI_MASK is redundant to the vendor-agnostic nmi_masked and would
therefore likely be masked out.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


Re: List of unaccessible x86 states

2009-10-20 Thread Alexander Graf


On 20.10.2009, at 15:19, Jan Kiszka wrote:


> Alexander Graf wrote:
>> On 20.10.2009, at 15:01, Jan Kiszka wrote:
>>
>>> Hi all,
>>>
>>> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
>>> this is an attempt to collect the precise requirements for additional
>>> state fields. Once everyone feels the list is complete, we can decide
>>> how to partition it into one ore more substates for the new
>>> KVM_GET/SET_VCPU_STATE interface.
>>>
>>> What I read so far (or tried to patch already):
>>>
>>> - nmi_masked
>>> - nmi_pending
>>> - nmi_injected
>>> - kvm_queued_exception (whole struct content)
>>> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
>>>
>>> Unclear points (for me) from the last discussion:
>>>
>>> - sipi_vector
>>> - MCE (covered via kvm_queued_exception, or does it require more?)
>>>
>>> Please extend or correct the list as required.
>>
>> hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to
>> sync it.
>
> OK. Whole hflags or just the GIF bit?


ag...@busu:~/git/kvm> grep -R HF_ arch/x86/include/asm/*kvm*
arch/x86/include/asm/kvm_host.h:#define HF_GIF_MASK     (1 << 0)
arch/x86/include/asm/kvm_host.h:#define HF_HIF_MASK     (1 << 1)
arch/x86/include/asm/kvm_host.h:#define HF_VINTR_MASK   (1 << 2)
arch/x86/include/asm/kvm_host.h:#define HF_NMI_MASK     (1 << 3)
arch/x86/include/asm/kvm_host.h:#define HF_IRET_MASK    (1 << 4)

I can only speak for GIF here, and that should be fine. Since I don't
know about the others, it does seem like we could get race conditions
with them though.



> If we allow access to all bits, can user space cause any problems
> (beyond screwing up its guests) by passing weird patterns?


IMHO the hflags should be converted between userspace and kernel  
representation. There's a good chance we run older userspace that  
doesn't know about certain flags yet and I'd like to keep the bits as  
flexible as possible.
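
Roughly what I have in mind, as a sketch only. The HF_* values are the
kernel-internal ones from the grep above; USER_HFLAG_GIF and both
helper functions are made-up names, and the userspace bit position is
picked arbitrarily to show that it doesn't have to match the kernel's.

```c
#include <stdint.h>

/* Kernel-internal bits, as printed by the grep above. */
#define HF_GIF_MASK   (1u << 0)
#define HF_HIF_MASK   (1u << 1)
#define HF_VINTR_MASK (1u << 2)
#define HF_NMI_MASK   (1u << 3)
#define HF_IRET_MASK  (1u << 4)

/* Hypothetical userspace ABI bits, independent of the kernel layout. */
#define USER_HFLAG_GIF    (1u << 8)
#define USER_HFLAGS_KNOWN (USER_HFLAG_GIF)

/* Export only the bits userspace is supposed to see. */
static uint32_t hflags_to_user(uint32_t hflags)
{
    uint32_t user = 0;

    if (hflags & HF_GIF_MASK)
        user |= USER_HFLAG_GIF;
    return user;
}

/*
 * Import the userspace view. Returns 0 on success, -1 if userspace
 * passed bits this kernel does not know; *hflags is untouched then.
 */
static int hflags_from_user(uint32_t user, uint32_t *hflags)
{
    if (user & ~USER_HFLAGS_KNOWN)
        return -1;

    *hflags &= ~HF_GIF_MASK;
    if (user & USER_HFLAG_GIF)
        *hflags |= HF_GIF_MASK;
    return 0;
}
```

That way the kernel can renumber or add internal flags later, and weird
patterns from userspace get rejected instead of ending up in hflags.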


Alex



Re: List of unaccessible x86 states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 03:19:41PM +0200, Jan Kiszka wrote:
> Alexander Graf wrote:
> > On 20.10.2009, at 15:01, Jan Kiszka wrote:
> > 
> >> Hi all,
> >>
> >> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
> >> this is an attempt to collect the precise requirements for additional
> >> state fields. Once everyone feels the list is complete, we can decide
> >> how to partition it into one ore more substates for the new
> >> KVM_GET/SET_VCPU_STATE interface.
> >>
> >> What I read so far (or tried to patch already):
> >>
> >> - nmi_masked
> >> - nmi_pending
> >> - nmi_injected
> >> - kvm_queued_exception (whole struct content)
> >> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
> >>
> >> Unclear points (for me) from the last discussion:
> >>
> >> - sipi_vector
> >> - MCE (covered via kvm_queued_exception, or does it require more?)
> >>
> >> Please extend or correct the list as required.
> > 
> > hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to  
> > sync it.
> 
> OK. Whole hflags or just the GIF bit?
> 
> If we allow access to all bits, can user space cause any problems
> (beyond screwing up its guests) by passing weird patterns?
> 
HF_NMI_MASK should be migrated too. The destination should enable the
IRET intercept if HF_NMI_MASK is set. Or we can assume that migration in
the middle of an NMI will never happen :)

--
Gleb.


Re: List of unaccessible x86 states

2009-10-20 Thread Jan Kiszka
Alexander Graf wrote:
> On 20.10.2009, at 15:01, Jan Kiszka wrote:
> 
>> Hi all,
>>
>> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
>> this is an attempt to collect the precise requirements for additional
>> state fields. Once everyone feels the list is complete, we can decide
>> how to partition it into one ore more substates for the new
>> KVM_GET/SET_VCPU_STATE interface.
>>
>> What I read so far (or tried to patch already):
>>
>> - nmi_masked
>> - nmi_pending
>> - nmi_injected
>> - kvm_queued_exception (whole struct content)
>> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
>>
>> Unclear points (for me) from the last discussion:
>>
>> - sipi_vector
>> - MCE (covered via kvm_queued_exception, or does it require more?)
>>
>> Please extend or correct the list as required.
> 
> hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to  
> sync it.

OK. Whole hflags or just the GIF bit?

If we allow access to all bits, can user space cause any problems
(beyond screwing up its guests) by passing weird patterns?

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


Re: List of unaccessible x86 states

2009-10-20 Thread Alexander Graf


On 20.10.2009, at 15:01, Jan Kiszka wrote:


> Hi all,
>
> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
> this is an attempt to collect the precise requirements for additional
> state fields. Once everyone feels the list is complete, we can decide
> how to partition it into one ore more substates for the new
> KVM_GET/SET_VCPU_STATE interface.
>
> What I read so far (or tried to patch already):
>
> - nmi_masked
> - nmi_pending
> - nmi_injected
> - kvm_queued_exception (whole struct content)
> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
>
> Unclear points (for me) from the last discussion:
>
> - sipi_vector
> - MCE (covered via kvm_queued_exception, or does it require more?)
>
> Please extend or correct the list as required.


hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to  
sync it.


Alex