Re: [RFC] ARM VM System Sepcification

2014-03-22 Thread Christoffer Dall
On Sat, Mar 22, 2014 at 08:19:52PM -0700, Christoffer Dall wrote:
> On Sat, Mar 22, 2014 at 09:08:37AM +0100, Paolo Bonzini wrote:
> > Il 22/03/2014 03:29, Christoffer Dall ha scritto:
> > >1. Simply mandate that VM implementations support persistent variables
> > >   for their UEFI implementation - with whatever constraints that may
> > >   put on higher level tools.
> > >
> > >2. Require that OSes shipped as part of compliant VM images make no
> > >   assumption that changes to the UEFI environment will be stored.
> > >
> > >I feel that option number two will break in all sorts of cases, just
> > >like Grant stated above, and it is fundamentally not practical; if a
> > >distribution ships Linux with a UEFI stub that expects to be able to do
> > >something, distributions must modify Linux to conform to this spec.  I
> > >think imagining that this spec controls how UEFI support in Linux/Grub
> > >is done in general would be overreaching.  Additionally, Michael brought
> > >up the fact that it would be non-UEFI compliant.
> > 
> > OSes are already able to cope with loss of changes to UEFI
> > environment are stored, because losing persistent variables is what
> > happens if you copy an image to a new hard disk.
> > 
> > Asking implementations for support of persistent variables is a good
> > idea; however, independent of what is in the spec, OSes should not
> > expect that users will enable that support---most of them won't.
> > 
> OK, fair enough, mandating support for persistent variable storage may
> be overreaching but at the same time I feel it is unlikely for this spec
> to reach far enough that a generic UEFI Linux loader, for example,
> actually follows it, and therefore explicitly mandating that guest OSes
> must be completely portable in all that they do is a non-practical
> constraint.  That was the request that kicked this discussion off, and
> what I was trygint to address.
> 
After thinking about this a bit more, I think I see what we're actually
discussing.  It's obvious that if software in a VM makes changes to UEFI
variables that are required to be persistent for that VM image to boot
again, then the VM image is no longer portable, as per the spec.

It is in fact probably a good idea to specify that out clearly, I found
it intuitively obvious, but it could be spelled out nevertheless.

-Christoffer
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] ARM VM System Sepcification

2014-03-22 Thread Christoffer Dall
On Sat, Mar 22, 2014 at 12:23:54PM +, Grant Likely wrote:
> On Fri, 21 Mar 2014 19:29:50 -0700, Christoffer Dall 
>  wrote:
> > On Fri, Mar 07, 2014 at 08:24:18PM +0800, Grant Likely wrote:
> > > On Thu, 6 Mar 2014 12:04:50 +, Robie Basak 
> > >  wrote:
> > > > On Thu, Mar 06, 2014 at 12:44:57PM +0100, Laszlo Ersek wrote:
> > > > > If I understand correctly, the question is this:
> > > > > 
> > > > >   Given a hypervisor that doesn't support non-volatile UEFI variables
> > > > >   (including BootOrder and Boot), is it possible to automatically
> > > > >   boot a carefully prepared VM image, made available as a fixed media
> > > > >   device?
> > > > > 
> > > > > The answer is "yes". See
> > > > 
> > > > Right, but I think there is a subsequent problem.
> > > > 
> > > > >   It is expected that this default boot will load an operating system 
> > > > > or
> > > > >   a maintenance utility. If this is an operating system setup program 
> > > > > it
> > > > >   is then responsible for setting the requisite environment variables
> > > > >   for subsequent boots. The platform firmware may also decide to 
> > > > > recover
> > > > 
> > > > >   or set to a known set of boot options.
> > > > 
> > > > It seems to me that the guest OS is permitted to assume that persistent
> > > > boot variables will work after first boot, for subsequent boots.
> > > > 
> > > > So, for example, the guest OS might, on bootloader or kernel upgrade,
> > > > completely replace the boot mechanism, dropping the removable path and
> > > > replacing it with a fixed disk arrangement, setting boot variables
> > > > appropriately, and assume that it can reboot and everything will
> > > > continue to work.
> > > > 
> > > > But if the host does not support non-volatile variables, then this will
> > > > break.
> > > 
> > > Correct
> > > 
> > > > This is why I'm suggesting that the specification mandate that the guest
> > > > OS shipped in a "portable disk image" as defined by the spec must not
> > > > make this assumption.
> > > 
> > > Also correct... the installer must be aware of this constraint which is
> > > why it is part of the draft spec.
> > > 
> > > > It's either this, or mandate that all hosts must support persistent
> > > > variables. I have no objection to that in principle, but since we have
> > > > no implementation currently, it seems easier to avoid this particular
> > > > roadblock by tweaking the spec in a way that nobody seems to care about
> > > > anyway.
> > > 
> > > Right. I guess my position is that if persistent storage is not
> > > implemented then there are a number of install/upgrade scenarios that
> > > won't work. Regardless, portable images must assume an empty boot list
> > > and we can build that into the spec.
> > > 
> > 
> > Sorry for the delay in responding - all sorts of unexpected things
> > happened when I returned from LCA14.
> > 
> > I agree on the technical discussion going on here.  My conclusion is
> > that we have two options:
> > 
> > 1. Simply mandate that VM implementations support persistent variables
> >for their UEFI implementation - with whatever constraints that may
> >put on higher level tools.
> > 
> > 2. Require that OSes shipped as part of compliant VM images make no
> >assumption that changes to the UEFI environment will be stored.
> > 
> > I feel that option number two will break in all sorts of cases, just
> > like Grant stated above, and it is fundamentally not practical; if a
> > distribution ships Linux with a UEFI stub that expects to be able to do
> > something, distributions must modify Linux to conform to this spec.  I
> > think imagining that this spec controls how UEFI support in Linux/Grub
> > is done in general would be overreaching.  Additionally, Michael brought
> > up the fact that it would be non-UEFI compliant.
> 
> That isn't actually my position. I absolutely think that VMs /should/
> implement persistent variables, but the variables are a property of a VM
> instance, not of the disk image. As far as this spec is concerned, I
> think portable disk images should operate under the assumption of an
> empty set of variables, and therefore follow the removable disk
> requirements in the UEFI spec.

I think we may have misunderstood each other a bit here, I didn't mean
anything in my statement above that contradicts what you say.

I am only saying that mandating what OSes do and don't do, once they've
been booted, is beyond the scope of this spec.

Portable VM images should absolutely boot with an empty set of variables
and I completely agree that the persistent variable storage is a
property of the VM.

> 
> I would propose a modification to option 1:
> 
> VM implementations SHOULD to implement persistent variables for
> their UEFI implementation - with whatever constraints that may be put on
> higher level tools. Variable storage SHALL be a property of a VM
> instance, but SHALL NOT be stored as part of a portable disk image.
> 

Re: [RFC] ARM VM System Sepcification

2014-03-22 Thread Christoffer Dall
On Sat, Mar 22, 2014 at 09:08:37AM +0100, Paolo Bonzini wrote:
> Il 22/03/2014 03:29, Christoffer Dall ha scritto:
> >1. Simply mandate that VM implementations support persistent variables
> >   for their UEFI implementation - with whatever constraints that may
> >   put on higher level tools.
> >
> >2. Require that OSes shipped as part of compliant VM images make no
> >   assumption that changes to the UEFI environment will be stored.
> >
> >I feel that option number two will break in all sorts of cases, just
> >like Grant stated above, and it is fundamentally not practical; if a
> >distribution ships Linux with a UEFI stub that expects to be able to do
> >something, distributions must modify Linux to conform to this spec.  I
> >think imagining that this spec controls how UEFI support in Linux/Grub
> >is done in general would be overreaching.  Additionally, Michael brought
> >up the fact that it would be non-UEFI compliant.
> 
> OSes are already able to cope with loss of changes to UEFI
> environment are stored, because losing persistent variables is what
> happens if you copy an image to a new hard disk.
> 
> Asking implementations for support of persistent variables is a good
> idea; however, independent of what is in the spec, OSes should not
> expect that users will enable that support---most of them won't.
> 
OK, fair enough, mandating support for persistent variable storage may
be overreaching but at the same time I feel it is unlikely for this spec
to reach far enough that a generic UEFI Linux loader, for example,
actually follows it, and therefore explicitly mandating that guest OSes
must be completely portable in all that they do is a non-practical
constraint.  That was the request that kicked this discussion off, and
what I was trygint to address.

-Christoffer
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] ARM VM System Sepcification

2014-03-22 Thread Laszlo Ersek
On 03/23/14 00:38, Michael Casadevall wrote:
> On 03/22/2014 03:57 PM, Paolo Bonzini wrote:
>> Il 22/03/2014 13:23, Grant Likely ha scritto:
>>> VM implementations SHOULD to implement persistent variables for 
>>> their UEFI implementation - with whatever constraints that may be
>>> put on higher level tools. Variable storage SHALL be a property
>>> of a VM instance, but SHALL NOT be stored as part of a portable
>>> disk image. Portable disk images SHALL conform to the UEFI
>>> removable disk requirements from the UEFI spec.
>>
>> I fully agree with this wording.
>>
>> Paolo
> 
> +1 here

I cannot resist getting productive here, so I'll mention that the first
"to" in the language should probably be dropped.

I would also replace the "but" with "and".

Happy to contribute, always.
Laszlo
:)
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] ARM VM System Sepcification

2014-03-22 Thread Michael Casadevall
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



On 03/22/2014 03:57 PM, Paolo Bonzini wrote:
> Il 22/03/2014 13:23, Grant Likely ha scritto:
>> VM implementations SHOULD to implement persistent variables for 
>> their UEFI implementation - with whatever constraints that may be
>> put on higher level tools. Variable storage SHALL be a property
>> of a VM instance, but SHALL NOT be stored as part of a portable
>> disk image. Portable disk images SHALL conform to the UEFI
>> removable disk requirements from the UEFI spec.
> 
> I fully agree with this wording.
> 
> Paolo

+1 here

Michael
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.14 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJTLh75AAoJEHM+GkLSJHY5XfwP/27RPv5+UQKgNJa0kooLEGqD
L5YkcofZOPeAdWZ0gzN9BDhzQ+4un5HhaJKTldwgKogblLmaiN9LMi/affSleF+N
HMqSjbxCUy/KduD8Ib4k7qZDkGuO3PbVGP/1RpcaE5jZ+EvQYHGqwRpzWYZyx7Ve
r4DQhSc5IyR8titm/hiCdFufDznQ+afrxuum8WQIC02H3vPYavDmVaApXrgYwnHA
vfSOjEKI4xfmBfWfrtVBFr8/2ZgP9mSuj9lYukun2nRG5Onk7fsPQSZuLCtk52x9
Hx6/hg5hskLM0cTfQ9yCyW3lUdQJkiLmmtn43EfKTJ80esrWpCzhkVD6b4qsYokd
7CpBHziupDTrjQTGundWOkw07pzsrlq3jOQ5tm4hmsr9qwtq4W2SOFfkM2q4KpfO
j0z2edqRoT32v0eAOhBvkC4cx9rfBVZR/sU6HmKqLM80zEMNcXmMfXgwZPqznlfs
bnT342tJPdQD+Mt215MbgKhxwq3kMAWsjW8/2Ad896lyhk6nU0ZbSIc4A8+HBU3M
Bn7tBmpRzuyExsjFxzkhLMfIAa3/UxQuI7o1WXD5asWHxQTKtPhOYTCyhNq4tBHo
8LU9dcP5Q1vZsAFa28MNjrNECf3hrLPxj06NI13CT6PST67FaScMqsV1ZT8Q9xm+
Cw4V5inPq5BvMpdmIu4S
=uG8J
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] ARM VM System Sepcification

2014-03-22 Thread Grant Likely
On Sat, 22 Mar 2014 20:57:58 +0100, Paolo Bonzini  wrote:
> Il 22/03/2014 13:23, Grant Likely ha scritto:
> > VM implementations SHOULD to implement persistent variables for
> > their UEFI implementation - with whatever constraints that may be put on
> > higher level tools. Variable storage SHALL be a property of a VM
> > instance, but SHALL NOT be stored as part of a portable disk image.
> > Portable disk images SHALL conform to the UEFI removable disk
> > requirements from the UEFI spec.
> 
> I fully agree with this wording.

:-)

On a seperate, but related, topic. If we were to define a portable VM
spec that includes the entire hardware configuration, then variables
storage should be included in that.

g.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: GPF in intel_pmu_lbr_reset() with qemu -cpu host

2014-03-22 Thread Peter Wu
On Saturday 22 March 2014 14:27:59 Gleb Natapov wrote:
> > but now I have a NULL dereference (in rapl_pmu_init). Previously, when
> > `-cpu SandyBridge` was passed to qemu, it would show this:
> > 
> > [0.016995] Performance Events: unsupported p6 CPU model 42 no PMU 
> > driver, software events only.
> > 
> > The same NULL pointer deref would be visible (slightly different
> > addresses, but the Code lines are equal). With `-host`, the NULL deref
> > with `-cpu host` contains:
> > 
> > [0.016445] Performance Events: 16-deep LBR, IvyBridge events, Intel 
> > PMU driver.
> > 
> > Full dmesg below.
> > 
> I am confused. Do you see crash now with -cpu SandyBridge and -cpu host, or 
> -cpu host only?

The RAPL crash is seen with both SandyBridge and host, I mentioned SB
because that environment should be more constant than "host" (which
depends on the CPU you have on, well, the host).

Peter

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: GPF in intel_pmu_lbr_reset() with qemu -cpu host

2014-03-22 Thread H. Peter Anvin
Using _safe has it's own issues if noone checks the errors.

On March 22, 2014 5:27:59 AM PDT, Gleb Natapov  wrote:
>On Sat, Mar 22, 2014 at 11:05:03AM +0100, Peter Wu wrote:
>> On Saturday 22 March 2014 10:50:45 Gleb Natapov wrote:
>> > On Fri, Mar 21, 2014 at 12:04:32PM -0700, Venkatesh Srinivas wrote:
>> > > On Fri, Mar 21, 2014 at 10:46 AM, Peter Wu 
>wrote:
>> > [skip]
>> > 
>> > > When -cpu host is used, qemu/kvm passed the host CPUID F/M/S to
>the
>> > > guest. intel_pmu_cpu_*() -> intel_pmu_lbr_reset() uses rdmsr() /
>> > > wrmsr(), rather than the safe variants; if KVM does not support
>the
>> > > particular MSRs in question, you will see a #GP(0) there. See
>> > > https://lkml.org/lkml/2014/3/13/453 for a similar bug other PMU
>code.
>> > > 
>> > When kernel is compiled with guest support all rdmsr()/wrmsr()
>become _safe(),
>> > so the question for Peter is if his guest kernel has guest support
>enabled?
>> 
>> Linux guest support (CONFIG_HYPERVISOR_GUEST) was not enabled, see
>> .config in the first mail[1]. Enabling that option does not change
>the
>> situation.
>> 
>> With CONFIG_PARAVIRT and CONFIG_KVM_GUEST enabled, the PMU GPF is
>gone,
>Yeah, it should be PARAVIRT indeed since rdmsr()/wrmsr() is substituted
>by _safe()
>using paravirt calls.
>
>> but now I have a NULL dereference (in rapl_pmu_init). Previously,
>when
>> `-cpu SandyBridge` was passed to qemu, it would show this:
>> 
>> [0.016995] Performance Events: unsupported p6 CPU model 42 no
>PMU driver, software events only.
>> 
>> The same NULL pointer deref would be visible (slightly different
>> addresses, but the Code lines are equal). With `-host`, the NULL
>deref
>> with `-cpu host` contains:
>> 
>> [0.016445] Performance Events: 16-deep LBR, IvyBridge events,
>Intel PMU driver.
>> 
>> Full dmesg below.
>> 
>I am confused. Do you see crash now with -cpu SandyBridge and -cpu
>host, or -cpu host only?
>
>--
>   Gleb.

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] ARM VM System Sepcification

2014-03-22 Thread Paolo Bonzini

Il 22/03/2014 13:23, Grant Likely ha scritto:

VM implementations SHOULD to implement persistent variables for
their UEFI implementation - with whatever constraints that may be put on
higher level tools. Variable storage SHALL be a property of a VM
instance, but SHALL NOT be stored as part of a portable disk image.
Portable disk images SHALL conform to the UEFI removable disk
requirements from the UEFI spec.


I fully agree with this wording.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: GPF in intel_pmu_lbr_reset() with qemu -cpu host

2014-03-22 Thread Gleb Natapov
On Sat, Mar 22, 2014 at 11:05:03AM +0100, Peter Wu wrote:
> On Saturday 22 March 2014 10:50:45 Gleb Natapov wrote:
> > On Fri, Mar 21, 2014 at 12:04:32PM -0700, Venkatesh Srinivas wrote:
> > > On Fri, Mar 21, 2014 at 10:46 AM, Peter Wu  wrote:
> > [skip]
> > 
> > > When -cpu host is used, qemu/kvm passed the host CPUID F/M/S to the
> > > guest. intel_pmu_cpu_*() -> intel_pmu_lbr_reset() uses rdmsr() /
> > > wrmsr(), rather than the safe variants; if KVM does not support the
> > > particular MSRs in question, you will see a #GP(0) there. See
> > > https://lkml.org/lkml/2014/3/13/453 for a similar bug other PMU code.
> > > 
> > When kernel is compiled with guest support all rdmsr()/wrmsr() become 
> > _safe(),
> > so the question for Peter is if his guest kernel has guest support enabled?
> 
> Linux guest support (CONFIG_HYPERVISOR_GUEST) was not enabled, see
> .config in the first mail[1]. Enabling that option does not change the
> situation.
> 
> With CONFIG_PARAVIRT and CONFIG_KVM_GUEST enabled, the PMU GPF is gone,
Yeah, it should be PARAVIRT indeed since rdmsr()/wrmsr() is substituted by 
_safe()
using paravirt calls.

> but now I have a NULL dereference (in rapl_pmu_init). Previously, when
> `-cpu SandyBridge` was passed to qemu, it would show this:
> 
> [0.016995] Performance Events: unsupported p6 CPU model 42 no PMU 
> driver, software events only.
> 
> The same NULL pointer deref would be visible (slightly different
> addresses, but the Code lines are equal). With `-host`, the NULL deref
> with `-cpu host` contains:
> 
> [0.016445] Performance Events: 16-deep LBR, IvyBridge events, Intel 
> PMU driver.
> 
> Full dmesg below.
> 
I am confused. Do you see crash now with -cpu SandyBridge and -cpu host, or 
-cpu host only?

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Where to file this bug, please?

2014-03-22 Thread Kashyap Chamarthy
On Sat, Mar 22, 2014 at 08:07:42AM +, Prof. Dr. Michael Schefczyk wrote:
> Dear Recipients,
> 
> As per http://www.linux-kvm.org/page/Bugs, I am seeking advice
> regarding where to file the following bug:

If you're experiencing it on CentOS

http://bugs.centos.org/

If you're experiencing it on RHEL, you could file it (a clear reproducer
will increase the likelihood of developer attention) in Red Hat
bugzilla, under Product "Red Hat Enterprise Linux 6", Component
"qemu-kvm":


https://bugzilla.redhat.com/enter_bug.cgi?product=Red%20Hat%20Enterprise%20Linux%206

-- 
/kashyap
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] KVM: nVMX: check for null vmcs12 when L1 does invept

2014-03-22 Thread Bandan Das
Jan Kiszka  writes:

> On 2014-03-20 21:58, Bandan Das wrote:
>> Jan Kiszka  writes:
>> 
>>> On 2014-03-20 04:28, Bandan Das wrote:
 Some L1 hypervisors such as Xen seem to be calling invept after
 vmclear or before vmptrld on L2. In this case, proceed with
 falling through and syncing roots as a case where
 context wide invalidation can't be supported
>>>
>>> Can we also base this behaviour on a statement in the SDM? But on first
>>> glance, I do not find anything like this over there.
>> 
>> The SDM has nothing of this sort explicitly mentioned but 28.3.3.1 
>> "Operations that invalidate Cached Mappings" does mention that
>> the instruction may invalidate mappings associated with other
>> EP4TAs (even in single context).
>
> Yes, "may". So we are implementing undefined behavior in order to please
> a broken hypervisor that relies on it? Then please state this in the
> patch and probably also inform Xen about their issue.

Why undefined behavior ? We don't do anything specific for 
the single context invalidation case ianyway .e If the eptp matches what 
vmcs12 has, single context invalidation does fall though to the global 
invalidation case already. All this change does is add the "L1 calls 
invept after vmclear and  before vmptrld" to the list of cases to fall 
though to global invalidation since nvmx doesn't have any knowledge of 
the current eptp for this case.

Or do you think we should rethink this approach ?

>> 
>> Note that I based this on what we currently do for context invalidation -
>> static inline void ept_sync_context(u64 eptp)
>> {
>>  if (enable_ept) {
>>  if (cpu_has_vmx_invept_context())
>>  __invept(VMX_EPT_EXTENT_CONTEXT, eptp, 0);
>>  else
>>  ept_sync_global();
>>  }
>> }
>
> Don't get your point. This test is about testing for the CPU support
> context invalidating, then falling back to global invalidation if there
> is no support.

Sorry, if this was confusing. All I was trying to say is switching to global
invalidation if we can't do single context invalidation for some reason 
is not unusual.

Thanks,
Bandan

> Jan
>
>> 
>> Seemed easier and cleaner than having a cached eptp after vmcs12 is 
>> long gone :)
>> 
>> If you prefer, I can modify the commit message to reflect this.
>> 
>>> Jan
>>>

 Signed-off-by: Bandan Das 
 ---
  arch/x86/kvm/vmx.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index c707389..b407b3a 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -6435,8 +6435,9 @@ static int handle_invept(struct kvm_vcpu *vcpu)
  
switch (type) {
case VMX_EPT_EXTENT_CONTEXT:
 -  if ((operand.eptp & eptp_mask) !=
 -  (nested_ept_get_cr3(vcpu) & eptp_mask))
 +  if (get_vmcs12(vcpu) &&
 +  ((operand.eptp & eptp_mask) != (nested_ept_get_cr3(vcpu) &
 +  eptp_mask)))
break;
case VMX_EPT_EXTENT_GLOBAL:
kvm_mmu_sync_roots(vcpu);

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] ARM VM System Sepcification

2014-03-22 Thread Grant Likely
On Fri, 21 Mar 2014 19:29:50 -0700, Christoffer Dall 
 wrote:
> On Fri, Mar 07, 2014 at 08:24:18PM +0800, Grant Likely wrote:
> > On Thu, 6 Mar 2014 12:04:50 +, Robie Basak  
> > wrote:
> > > On Thu, Mar 06, 2014 at 12:44:57PM +0100, Laszlo Ersek wrote:
> > > > If I understand correctly, the question is this:
> > > > 
> > > >   Given a hypervisor that doesn't support non-volatile UEFI variables
> > > >   (including BootOrder and Boot), is it possible to automatically
> > > >   boot a carefully prepared VM image, made available as a fixed media
> > > >   device?
> > > > 
> > > > The answer is "yes". See
> > > 
> > > Right, but I think there is a subsequent problem.
> > > 
> > > >   It is expected that this default boot will load an operating system or
> > > >   a maintenance utility. If this is an operating system setup program it
> > > >   is then responsible for setting the requisite environment variables
> > > >   for subsequent boots. The platform firmware may also decide to recover
> > > 
> > > >   or set to a known set of boot options.
> > > 
> > > It seems to me that the guest OS is permitted to assume that persistent
> > > boot variables will work after first boot, for subsequent boots.
> > > 
> > > So, for example, the guest OS might, on bootloader or kernel upgrade,
> > > completely replace the boot mechanism, dropping the removable path and
> > > replacing it with a fixed disk arrangement, setting boot variables
> > > appropriately, and assume that it can reboot and everything will
> > > continue to work.
> > > 
> > > But if the host does not support non-volatile variables, then this will
> > > break.
> > 
> > Correct
> > 
> > > This is why I'm suggesting that the specification mandate that the guest
> > > OS shipped in a "portable disk image" as defined by the spec must not
> > > make this assumption.
> > 
> > Also correct... the installer must be aware of this constraint which is
> > why it is part of the draft spec.
> > 
> > > It's either this, or mandate that all hosts must support persistent
> > > variables. I have no objection to that in principle, but since we have
> > > no implementation currently, it seems easier to avoid this particular
> > > roadblock by tweaking the spec in a way that nobody seems to care about
> > > anyway.
> > 
> > Right. I guess my position is that if persistent storage is not
> > implemented then there are a number of install/upgrade scenarios that
> > won't work. Regardless, portable images must assume an empty boot list
> > and we can build that into the spec.
> > 
> 
> Sorry for the delay in responding - all sorts of unexpected things
> happened when I returned from LCA14.
> 
> I agree on the technical discussion going on here.  My conclusion is
> that we have two options:
> 
> 1. Simply mandate that VM implementations support persistent variables
>for their UEFI implementation - with whatever constraints that may
>put on higher level tools.
> 
> 2. Require that OSes shipped as part of compliant VM images make no
>assumption that changes to the UEFI environment will be stored.
> 
> I feel that option number two will break in all sorts of cases, just
> like Grant stated above, and it is fundamentally not practical; if a
> distribution ships Linux with a UEFI stub that expects to be able to do
> something, distributions must modify Linux to conform to this spec.  I
> think imagining that this spec controls how UEFI support in Linux/Grub
> is done in general would be overreaching.  Additionally, Michael brought
> up the fact that it would be non-UEFI compliant.

That isn't actually my position. I absolutely think that VMs /should/
implement persistent variables, but the variables are a property of a VM
instance, not of the disk image. As far as this spec is concerned, I
think portable disk images should operate under the assumption of an
empty set of variables, and therefore follow the removable disk
requirements in the UEFI spec.

I would propose a modification to option 1:

VM implementations SHOULD to implement persistent variables for
their UEFI implementation - with whatever constraints that may be put on
higher level tools. Variable storage SHALL be a property of a VM
instance, but SHALL NOT be stored as part of a portable disk image.
Portable disk images SHALL conform to the UEFI removable disk
requirements from the UEFI spec.

g.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Is there any consumer of virtio-balloon now?

2014-03-22 Thread Kai Huang

Just see the reply. Thanks a lot!

Thanks,
-Kai
On 2014年03月21日 17:03, Kashyap Chamarthy wrote:

On Fri, Mar 21, 2014 at 12:22:54PM +0800, Kai Huang wrote:

Hi,

I see the virtio-balloon is designed for memory auto balloon between
KVM host and guest, but from latest linux kernel mainline code, looks
currently there's no consumer actually using it? Would you let me know
who is the consumer if there's any?

>From a quick look up, I see at-least there are some users[1] on
libvirt-users list. And, here's a nice explanation/usage of it[2]


   [1] https://www.redhat.com/archives/libvirt-users/2011-October/msg00059.html
   [2] http://rwmj.wordpress.com/2010/07/17/virtio-balloon/



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Is there any consumer of virtio-balloon now?

2014-03-22 Thread Kai Huang
Thanks Paolo. What's the user space tool / command to trigger the 
virtio_balloon functionality? Basically I am looking for the whole code 
patch that triggers the virtio_balloon.


Thanks,
-Kai
On 2014年03月21日 16:51, Paolo Bonzini wrote:

Il 21/03/2014 05:22, Kai Huang ha scritto:

Hi,

I see the virtio-balloon is designed for memory auto balloon between
KVM host and guest, but from latest linux kernel mainline code, looks
currently there's no consumer actually using it? Would you let me know
who is the consumer if there's any?


The virtio-balloon driver is in drivers/virtio/virtio_balloon.c. Right 
now it only does manual ballooning though, not automatic.


Paolo



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] KVM: nVMX: check for null vmcs12 when L1 does invept

2014-03-22 Thread Jan Kiszka
On 2014-03-20 21:58, Bandan Das wrote:
> Jan Kiszka  writes:
> 
>> On 2014-03-20 04:28, Bandan Das wrote:
>>> Some L1 hypervisors such as Xen seem to be calling invept after
>>> vmclear or before vmptrld on L2. In this case, proceed with
>>> falling through and syncing roots as a case where
>>> context wide invalidation can't be supported
>>
>> Can we also base this behaviour on a statement in the SDM? But on first
>> glance, I do not find anything like this over there.
> 
> The SDM has nothing of this sort explicitly mentioned but 28.3.3.1 
> "Operations that invalidate Cached Mappings" does mention that
> the instruction may invalidate mappings associated with other
> EP4TAs (even in single context).

Yes, "may". So we are implementing undefined behavior in order to please
a broken hypervisor that relies on it? Then please state this in the
patch and probably also inform Xen about their issue.

> 
> Note that I based this on what we currently do for context invalidation -
> static inline void ept_sync_context(u64 eptp)
> {
>   if (enable_ept) {
>   if (cpu_has_vmx_invept_context())
>   __invept(VMX_EPT_EXTENT_CONTEXT, eptp, 0);
>   else
>   ept_sync_global();
>   }
> }

Don't get your point. This test is about testing for the CPU support
context invalidating, then falling back to global invalidation if there
is no support.

Jan

> 
> Seemed easier and cleaner than having a cached eptp after vmcs12 is 
> long gone :)
> 
> If you prefer, I can modify the commit message to reflect this.
> 
>> Jan
>>
>>>
>>> Signed-off-by: Bandan Das 
>>> ---
>>>  arch/x86/kvm/vmx.c | 5 +++--
>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>> index c707389..b407b3a 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -6435,8 +6435,9 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>>>  
>>> switch (type) {
>>> case VMX_EPT_EXTENT_CONTEXT:
>>> -   if ((operand.eptp & eptp_mask) !=
>>> -   (nested_ept_get_cr3(vcpu) & eptp_mask))
>>> +   if (get_vmcs12(vcpu) &&
>>> +   ((operand.eptp & eptp_mask) != (nested_ept_get_cr3(vcpu) &
>>> +   eptp_mask)))
>>> break;
>>> case VMX_EPT_EXTENT_GLOBAL:
>>> kvm_mmu_sync_roots(vcpu);
>>>



signature.asc
Description: OpenPGP digital signature


Re: GPF in intel_pmu_lbr_reset() with qemu -cpu host

2014-03-22 Thread Peter Wu
On Saturday 22 March 2014 10:50:45 Gleb Natapov wrote:
> On Fri, Mar 21, 2014 at 12:04:32PM -0700, Venkatesh Srinivas wrote:
> > On Fri, Mar 21, 2014 at 10:46 AM, Peter Wu  wrote:
> [skip]
> 
> > When -cpu host is used, qemu/kvm passed the host CPUID F/M/S to the
> > guest. intel_pmu_cpu_*() -> intel_pmu_lbr_reset() uses rdmsr() /
> > wrmsr(), rather than the safe variants; if KVM does not support the
> > particular MSRs in question, you will see a #GP(0) there. See
> > https://lkml.org/lkml/2014/3/13/453 for a similar bug other PMU code.
> > 
> When kernel is compiled with guest support all rdmsr()/wrmsr() become _safe(),
> so the question for Peter is if his guest kernel has guest support enabled?

Linux guest support (CONFIG_HYPERVISOR_GUEST) was not enabled, see
.config in the first mail[1]. Enabling that option does not change the
situation.

With CONFIG_PARAVIRT and CONFIG_KVM_GUEST enabled, the PMU GPF is gone,
but now I have a NULL dereference (in rapl_pmu_init). Previously, when
`-cpu SandyBridge` was passed to qemu, it would show this:

[0.016995] Performance Events: unsupported p6 CPU model 42 no PMU 
driver, software events only.

The same NULL pointer deref would be visible (slightly different
addresses, but the Code lines are equal). With `-host`, the NULL deref
with `-cpu host` contains:

[0.016445] Performance Events: 16-deep LBR, IvyBridge events, Intel PMU 
driver.

Full dmesg below.

Stepping through the debugger (for `-cpu SandyBridge`) showed that the
pmu variable is NULL (in arch/x86/kernel/cpu/perf_event_intel_rapl.c)
Skipping that line (`jump +1` in gdb) allowed Linux to continue
booting without crashing. I have no idea what is going on here,
but hopefully this information helps.

Kind regards,
Peter

 [1]: https://lkml.kernel.org/r/4055058.qLAukpngnj@al

[0.00] Linux version 3.14.0-rc7-qemu-00059-g08edb33 (pc@antartica) (gcc 
version 4.8.2 (Ubuntu 4.8.2-16ubuntu6) ) #27 Sat Mar 22 10:06:46 CET 2014
[0.00] Command line: console=ttyS0 loglevel=8
[0.00] KERNEL supported cpus:
[0.00]   Intel GenuineIntel
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000f-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x3fffdfff] usable
[0.00] BIOS-e820: [mem 0x3fffe000-0x3fff] reserved
[0.00] BIOS-e820: [mem 0xfeffc000-0xfeff] reserved
[0.00] BIOS-e820: [mem 0xfffc-0x] reserved
[0.00] NX (Execute Disable) protection: active
[0.00] SMBIOS 2.4 present.
[0.00] DMI: Bochs Bochs, BIOS Bochs 01/01/2011
[0.00] Hypervisor detected: KVM
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x3fffe max_arch_pfn = 0x4
[0.00] MTRR default type: write-back
[0.00] MTRR fixed ranges enabled:
[0.00]   0-9 write-back
[0.00]   A-B uncachable
[0.00]   C-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00]   0 base 008000 mask FF8000 uncachable
[0.00]   1 disabled
[0.00]   2 disabled
[0.00]   3 disabled
[0.00]   4 disabled
[0.00]   5 disabled
[0.00]   6 disabled
[0.00]   7 disabled
[0.00] x86 PAT enabled: cpu 0, old 0x70406, new 0x7010600070106
[0.00] found SMP MP-table at [mem 0x000f1b10-0x000f1b1f] mapped at 
[880f1b10]
[0.00] Base memory trampoline at [88099000] 99000 size 24576
[0.00] init_memory_mapping: [mem 0x-0x000f]
[0.00]  [mem 0x-0x000f] page 4k
[0.00] BRK [0x019d1000, 0x019d1fff] PGTABLE
[0.00] BRK [0x019d2000, 0x019d2fff] PGTABLE
[0.00] BRK [0x019d3000, 0x019d3fff] PGTABLE
[0.00] init_memory_mapping: [mem 0x3fc0-0x3fdf]
[0.00]  [mem 0x3fc0-0x3fdf] page 2M
[0.00] init_memory_mapping: [mem 0x3c00-0x3fbf]
[0.00]  [mem 0x3c00-0x3fbf] page 2M
[0.00] init_memory_mapping: [mem 0x0010-0x3bff]
[0.00]  [mem 0x0010-0x001f] page 4k
[0.00]  [mem 0x0020-0x3bff] page 2M
[0.00] init_memory_mapping: [mem 0x3fe0-0x3fffdfff]
[0.00]  [mem 0x3fe0-0x3fffdfff] page 4k
[0.00] BRK [0x019d4000, 0x019d4fff] PGTABLE
[0.00] ACPI: RSDP 000f19b0 14 (v00 BOCHS )
[0.00] ACPI: RSDT 3ad3 34 (v01 BOCHS  BXPCRSDT 0001 
BXPC 0001)
[0.00] ACPI: FACP 3177 74 (v01 BOCHS  BXPCFACP 0001 
BXPC 0001)
[0.00] ACPI: DSDT 3fffe040 00

Re: GPF in intel_pmu_lbr_reset() with qemu -cpu host

2014-03-22 Thread Gleb Natapov
On Fri, Mar 21, 2014 at 12:04:32PM -0700, Venkatesh Srinivas wrote:
> On Fri, Mar 21, 2014 at 10:46 AM, Peter Wu  wrote:
[skip]

> When -cpu host is used, qemu/kvm passed the host CPUID F/M/S to the
> guest. intel_pmu_cpu_*() -> intel_pmu_lbr_reset() uses rdmsr() /
> wrmsr(), rather than the safe variants; if KVM does not support the
> particular MSRs in question, you will see a #GP(0) there. See
> https://lkml.org/lkml/2014/3/13/453 for a similar bug other PMU code.
> 
When kernel is compiled with guest support all rdmsr()/wrmsr() become _safe(),
so the question for Peter is if his guest kernel has guest support enabled?
 
--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Where to file this bug, please?

2014-03-22 Thread Prof. Dr. Michael Schefczyk
Dear Recipients,

As per http://www.linux-kvm.org/page/Bugs, I am seeking advice regarding where 
to file the following bug:

I think that the current RHEL/Centos-Version of KVM 
(qemu-kvm-0.12.1.2-2.415.el6_5.6.x86_64) on Centos 6.5 does make it 
substantially more difficult to install a Windows 7 Professional 64 guest. When 
run, the installer does show Microsoft artwork, but then asks for a missing 
CD-ROM driver. This is not intuitive, as the image connected as a CD-ROM must 
be readable in order to get the Microsoft artwork in the first place. Then, the 
driver is not available and the installation process stops (did not try PXE, 
yet). This did not occur in the past and this has nothing to do with the usual 
Windows VirtIO drivers available from Fedora. The VirtIO drivers do not solve 
this and they do not deal with CD-ROMs as far as I know.

Regards,

Michael

Prof. Dr. Michael Schefczyk 
Steuerberater
Selliner Str. 10
D-01109 Dresden 

Fon: +49-3 51-4 59 56 40 
Fax: +49-3 51-4 59 56 50 
www.stb-schefczyk.de

USt-IdNr. DE 202504562


smime.p7s
Description: S/MIME Cryptographic Signature


Re: [RFC] ARM VM System Sepcification

2014-03-22 Thread Paolo Bonzini

Il 22/03/2014 03:29, Christoffer Dall ha scritto:

1. Simply mandate that VM implementations support persistent variables
   for their UEFI implementation - with whatever constraints that may
   put on higher level tools.

2. Require that OSes shipped as part of compliant VM images make no
   assumption that changes to the UEFI environment will be stored.

I feel that option number two will break in all sorts of cases, just
like Grant stated above, and it is fundamentally not practical; if a
distribution ships Linux with a UEFI stub that expects to be able to do
something, distributions must modify Linux to conform to this spec.  I
think imagining that this spec controls how UEFI support in Linux/Grub
is done in general would be overreaching.  Additionally, Michael brought
up the fact that it would be non-UEFI compliant.


OSes are already able to cope with loss of changes to UEFI environment 
are stored, because losing persistent variables is what happens if you 
copy an image to a new hard disk.


Asking implementations for support of persistent variables is a good 
idea; however, independent of what is in the spec, OSes should not 
expect that users will enable that support---most of them won't.


Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: GPF in intel_pmu_lbr_reset() with qemu -cpu host

2014-03-22 Thread Paolo Bonzini

Il 21/03/2014 20:09, H. Peter Anvin ha scritto:

Calling this a bug in the PMU code is ridiculous.  If KVM tells the
system it os a specific vendor-family-model-stepping but diverges in
behavior then it, by definition, is broken.


Yeah, this is true.  On AMD there is processor support for virtualizing 
LBR, but Intel doesn't have it.  I'm not sure if generic load/save MSR 
support could be used to do it.


Unfortunately, LBR does not have any CPUID bit to show its presence, 
unlike a lot of other perf-related features.  So, even though calling it 
a bug in perf code is an exaggeration, using rdmsr_safe makes sense.


Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/4] KVM: ioapic: clear IRR for edge-triggered interrupts at delivery

2014-03-22 Thread Paolo Bonzini

Il 21/03/2014 19:34, Eduardo Habkost ha scritto:

> +  if (irqe.trig_mode == IOAPIC_EDGE_TRIG)
> +  ioapic->irr &= ~(1 << irq);
> +

Now, every call to ioapic_service() for an edge interrupt clears the IRR
bit immediately (assuming the mask is unset).

If the IRR bit is immediately zero on delivery, why won't this break the
edge detection logic on kvm_ioapic_set_irq()? Am I missing some
additional detail?


That logic will still trigger if the interrupt is masked in the IOAPIC's 
ioredirtbl.



In other words, won't this cause spurious interrupts if
kvm_ioapic_set_irq(..., true) is called twice?


Yes, and this is why I don't like this patch very much.  Basically it 
leaves it up to userspace to only send edge-triggered interrupts on an 
actual rising edge and never do two consecutive kvm_ioapic_set_irq(..., 
true) ioctls.


On the other hand, treating IRR this way is how QEMU's userspace IOAPIC 
works already, so the chance of bugs is smaller than any alternative; 
and the alternatives aren't that good either.  For example, I had 
thought about using the remote_irr bit to store the status.  In order to 
keep the old behavior where remote_irr is zero for edge-triggered 
interrupts, the bit can be masked out when reading the ioredirtbl.


KVM_SET_IRQCHIP then could look at irr & ~remote_irr to find interrupts 
that have to be delivered.  However, I was afraid that this would cause 
problems on migration from new to old kernels, which would let userspace 
see remote_irr=1 for edge-triggered interrupts.


Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html