Re: Release plan for 0.12.0

2009-10-20 Thread Takahiro Hirofuchi
Hello,


2009/9/30 Anthony Liguori aligu...@us.ibm.com:
 Hi,

 Now that 0.11.0 is behind us, it's time to start thinking about 0.12.0.

 o storage live migration

Sorry for being a bit off-topic, but my special NBD server can do this
independently of the VMM implementation.
See http://bitbucket.org/hirofuchi/xnbd/wiki/Home if interested.


Takahiro
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: nice value is ignored on cpu time accounting of a guest?

2009-10-20 Thread Ryota Ozaki
On Tue, Oct 20, 2009 at 2:21 PM, Avi Kivity a...@redhat.com wrote:
 On 10/19/2009 06:46 PM, Ryota Ozaki wrote:

 Hi,

 I have a question about CPU time accounting for a guest. The CPU time of a
 guest is always accounted as 'user' time in cpustat, even if the nice value
 of the guest is higher than 0. Is there a reason for this? I think the CPU
 time of the guest should be accounted as 'nice', the same as for a normal
 process. Am I wrong?


 Hm, guest time is accounted separately, and added to user time in /proc (so
 tools that don't know about guest time can read it as user time).

Yes, but I think that always adding it to user time, regardless of the nice
value, is a problem.
I want to fix it because user time is meant to account for processes that
are not niced (nice <= 0).


 Looks like we need to add a separate guest_nice, or get rid of guest time
 altogether.

Hmm, guest time is already exposed via /proc/stat, so is adding guest_nice the
better fix here? I don't know of anyone who uses the 'guest' value, though.

  ozaki-r


 --
 I have a truly marvellous patch that fixes the bug which this
 signature is too narrow to contain.




Re: nice value is ignored on cpu time accounting of a guest?

2009-10-20 Thread Avi Kivity

On 10/20/2009 04:06 PM, Ryota Ozaki wrote:

Looks like we need to add a separate guest_nice, or get rid of guest time
altogether.
 

Hmm, guest time is already exposed via /proc/stat, so is adding guest_nice the
better fix here? I don't know of anyone who uses the 'guest' value, though.
   


No one uses guest time to my knowledge.  However, we can't be sure, so 
it's better to add guest_nice.


Note you need to add guest_nice to user_nice, so old tools see it as 
nice time (same as guest_time now).
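
The accounting rule described here can be modelled in user space. The following is only a minimal sketch, not the kernel code: struct demo_cpustat and demo_account_guest_time() are hypothetical names, and plain 64-bit counters stand in for the kernel's cputime64_t values. It shows guest time from a niced task being charged to both nice and guest_nice, so a tool that only knows about nice still sees it.

```c
#include <stdint.h>

/* Hypothetical user-space model of the per-CPU counters (not the kernel struct). */
struct demo_cpustat {
        uint64_t user;
        uint64_t nice;
        uint64_t guest;
        uint64_t guest_nice;
};

/*
 * Charge 'delta' units of guest time for a task with the given nice value.
 * Niced tasks (nice > 0) are charged to nice and guest_nice; all others
 * are charged to user and guest, mirroring how guest time is folded into
 * user time today.
 */
static void demo_account_guest_time(struct demo_cpustat *st, int task_nice,
                                    uint64_t delta)
{
        if (task_nice > 0) {
                st->nice += delta;
                st->guest_nice += delta;
        } else {
                st->user += delta;
                st->guest += delta;
        }
}
```

With this layout, the nice counter is always at least as large as guest_nice, just as user is always at least as large as guest.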


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: nice value is ignored on cpu time accounting of a guest?

2009-10-20 Thread Avi Kivity

On 10/20/2009 04:27 PM, Ryota Ozaki wrote:

On Tue, Oct 20, 2009 at 4:17 PM, Avi Kivity a...@redhat.com wrote:
   

On 10/20/2009 04:06 PM, Ryota Ozaki wrote:
 

Looks like we need to add a separate guest_nice, or get rid of guest time
altogether.

 

Hmm, guest time is already exposed via /proc/stat, so is adding guest_nice the
better fix here? I don't know of anyone who uses the 'guest' value, though.

   

No one uses guest time to my knowledge.  However, we can't be sure, so it's
better to add guest_nice.

Note you need to add guest_nice to user_nice, so old tools see it as nice
time (same as guest_time now).
 

Well, like this?

        /* Add user time to cpustat. */
        tmp = cputime_to_cputime64(cputime);
        if (TASK_NICE(p) > 0) {
                cpustat->nice = cputime64_add(cpustat->nice, tmp);
                cpustat->guest_nice = cputime64_add(cpustat->guest_nice, tmp);
        } else {
                cpustat->user = cputime64_add(cpustat->user, tmp);
                cpustat->guest = cputime64_add(cpustat->guest, tmp);
        }
   


In account_guest_time()?  Yes.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: nice value is ignored on cpu time accounting of a guest?

2009-10-20 Thread Ryota Ozaki
On Tue, Oct 20, 2009 at 4:34 PM, Avi Kivity a...@redhat.com wrote:
 On 10/20/2009 04:27 PM, Ryota Ozaki wrote:

 Well, like this?

         /* Add user time to cpustat. */
         tmp = cputime_to_cputime64(cputime);
         if (TASK_NICE(p) > 0) {
                 cpustat->nice = cputime64_add(cpustat->nice, tmp);
                 cpustat->guest_nice = cputime64_add(cpustat->guest_nice, tmp);
         } else {
                 cpustat->user = cputime64_add(cpustat->user, tmp);
                 cpustat->guest = cputime64_add(cpustat->guest, tmp);
         }


 In account_guest_time()?  Yes.

Yes.

OK, I'll send a patch later. Thanks!

  ozaki-r


[PATCH] Fix up vmx_set_segment for booting older guests.

2009-10-20 Thread Chris Lalancette
If a guest happens to be unlucky enough to use an address
such as 0xc0000000 in the CS base address field, the next attempt
to VM enter will fail.  This is because the vmcs_writel() that
writes the base address into the VMCS will sign-extend the field
to 64-bits, and the Intel manual states that bits 63:32 of this
field *must* be 0.  Use vmcs_write32() where appropriate.
This fixes booting of an absolutely ancient Red Hat Linux 5.2
(not Enterprise Linux!) guest.

Signed-off-by: Chris Lalancette clala...@redhat.com
---
 arch/x86/kvm/vmx.c |   17 -
 1 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 364263a..311afd4 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1846,7 +1846,22 @@ static void vmx_set_segment(struct kvm_vcpu *vcpu,
                vmx->rmode.tr.ar = vmx_segment_access_rights(var);
                return;
        }
-       vmcs_writel(sf->base, var->base);
+
+       /* Intel 64 and IA-32 Architecture Software Developer's Manual Vol. 3b,
+        * section 22.3.1.2 states that VMENTRY will fail if bits 63:32 of the
+        * base address for CS, SS, DS, ES are not 0 and the register is usable.
+        *
+        * If var->base happens to have bit 31 set, then it will get sign
+        * extended on the vmcs_writel(), causing this check to fail.  Make
+        * sure to use the 32-bit version where appropriate.
+        */
+       if (sf->base == GUEST_CS_BASE ||
+           ((~sf->ar_bytes & 0x00010000) && (sf->base == GUEST_SS_BASE ||
+                                              sf->base == GUEST_DS_BASE ||
+                                              sf->base == GUEST_ES_BASE)))
+               vmcs_write32(sf->base, var->base);
+       else
+               vmcs_writel(sf->base, var->base);
        vmcs_write32(sf->limit, var->limit);
        vmcs_write16(sf->selector, var->selector);
        if (vmx->rmode.vm86_active && var->s) {
-- 
1.6.0.6



[PATCH] Print Guest VMCS state on vmexit failure

2009-10-20 Thread Chris Lalancette
If we fail to handle a VMEXIT for some reason, print out a lot
more debugging information about the state of the GUEST VMCS
area.  This does not fix a bug, but helps a lot when trying to
track down the cause of a VMEXIT/VMENTRY failure.

Signed-off-by: Chris Lalancette clala...@redhat.com
---
 arch/x86/kvm/vmx.c |   38 ++
 1 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 311afd4..37b1682 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3452,6 +3452,14 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
 static const int kvm_vmx_max_exit_handlers =
        ARRAY_SIZE(kvm_vmx_exit_handlers);
 
+#define PRINT_GUEST_SEGMENT(seg) do {                                  \
+       printk(KERN_DEBUG #seg ": SELECTOR 0x%lx, BASE 0x%lx, LIMIT 0x%lx, AR 0x%lx\n", \
+              vmcs_readl(GUEST_##seg##_SELECTOR),                      \
+              vmcs_readl(GUEST_##seg##_BASE),                          \
+              vmcs_readl(GUEST_##seg##_LIMIT),                         \
+              vmcs_readl(GUEST_##seg##_AR_BYTES));                     \
+       } while(0)
+
 /*
  * The guest has exited.  See if we can fix it or if we need userspace
  * assistance.
@@ -3512,6 +3520,36 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
        else {
                vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
                vcpu->run->hw.hardware_exit_reason = exit_reason;
+
+               printk(KERN_DEBUG "GUEST STATE:\n");
+               printk(KERN_DEBUG "CR0: 0x%lx\n", vmcs_readl(GUEST_CR0));
+               printk(KERN_DEBUG "CR3: 0x%lx\n", vmcs_readl(GUEST_CR3));
+               printk(KERN_DEBUG "CR4: 0x%lx\n", vmcs_readl(GUEST_CR4));
+               printk(KERN_DEBUG "VMENTRY CONTROL: 0x%lx\n",
+                      vmcs_readl(VM_ENTRY_CONTROLS));
+               printk(KERN_DEBUG "DR7: 0x%lx\n", vmcs_readl(GUEST_DR7));
+               printk(KERN_DEBUG "SYSENTER ESP: 0x%lx\n",
+                      vmcs_readl(GUEST_SYSENTER_ESP));
+               printk(KERN_DEBUG "SYSENTER EIP: 0x%lx\n",
+                      vmcs_readl(GUEST_SYSENTER_EIP));
+
+               PRINT_GUEST_SEGMENT(CS);
+               PRINT_GUEST_SEGMENT(SS);
+               PRINT_GUEST_SEGMENT(DS);
+               PRINT_GUEST_SEGMENT(ES);
+               PRINT_GUEST_SEGMENT(FS);
+               PRINT_GUEST_SEGMENT(GS);
+               PRINT_GUEST_SEGMENT(TR);
+               PRINT_GUEST_SEGMENT(LDTR);
+
+               printk(KERN_DEBUG "GDTR: BASE 0x%lx, LIMIT 0x%lx\n",
+                      vmcs_readl(GUEST_GDTR_BASE),
+                      vmcs_readl(GUEST_GDTR_LIMIT));
+               printk(KERN_DEBUG "IDTR: BASE 0x%lx, LIMIT 0x%lx\n",
+                      vmcs_readl(GUEST_IDTR_BASE),
+                      vmcs_readl(GUEST_IDTR_LIMIT));
+               printk(KERN_DEBUG "RIP: 0x%lx\n", vmcs_readl(GUEST_RIP));
+               printk(KERN_DEBUG "RFLAGS: 0x%lx\n", vmcs_readl(GUEST_RFLAGS));
        }
        return 0;
 }
-- 
1.6.0.6



Re: [PATCH] Print Guest VMCS state on vmexit failure

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 09:50:45AM +0200, Chris Lalancette wrote:
 If we fail to handle a VMEXIT for some reason, print out a lot
 more debugging information about the state of the GUEST VMCS
 area.  This does not fix a bug, but helps a lot when trying to
 track down the cause of a VMEXIT/VMENTRY failure.
 
 Signed-off-by: Chris Lalancette clala...@redhat.com
 ---
  arch/x86/kvm/vmx.c |   38 ++
  1 files changed, 38 insertions(+), 0 deletions(-)
 
 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index 311afd4..37b1682 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -3452,6 +3452,14 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
  static const int kvm_vmx_max_exit_handlers =
         ARRAY_SIZE(kvm_vmx_exit_handlers);
  
 +#define PRINT_GUEST_SEGMENT(seg) do {                                  \
 +       printk(KERN_DEBUG #seg ": SELECTOR 0x%lx, BASE 0x%lx, LIMIT 0x%lx, AR 0x%lx\n", \
 +              vmcs_readl(GUEST_##seg##_SELECTOR),                      \
 +              vmcs_readl(GUEST_##seg##_BASE),                          \
 +              vmcs_readl(GUEST_##seg##_LIMIT),                         \
 +              vmcs_readl(GUEST_##seg##_AR_BYTES));                     \
 +       } while(0)
 +
  /*
   * The guest has exited.  See if we can fix it or if we need userspace
   * assistance.
 @@ -3512,6 +3520,36 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
         else {
                 vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
                 vcpu->run->hw.hardware_exit_reason = exit_reason;
 +
 +               printk(KERN_DEBUG "GUEST STATE:\n");
 +               printk(KERN_DEBUG "CR0: 0x%lx\n", vmcs_readl(GUEST_CR0));
 +               printk(KERN_DEBUG "CR3: 0x%lx\n", vmcs_readl(GUEST_CR3));
 +               printk(KERN_DEBUG "CR4: 0x%lx\n", vmcs_readl(GUEST_CR4));
 +               printk(KERN_DEBUG "VMENTRY CONTROL: 0x%lx\n",
 +                      vmcs_readl(VM_ENTRY_CONTROLS));
 +               printk(KERN_DEBUG "DR7: 0x%lx\n", vmcs_readl(GUEST_DR7));
 +               printk(KERN_DEBUG "SYSENTER ESP: 0x%lx\n",
 +                      vmcs_readl(GUEST_SYSENTER_ESP));
 +               printk(KERN_DEBUG "SYSENTER EIP: 0x%lx\n",
 +                      vmcs_readl(GUEST_SYSENTER_EIP));
 +
 +               PRINT_GUEST_SEGMENT(CS);
 +               PRINT_GUEST_SEGMENT(SS);
 +               PRINT_GUEST_SEGMENT(DS);
 +               PRINT_GUEST_SEGMENT(ES);
 +               PRINT_GUEST_SEGMENT(FS);
 +               PRINT_GUEST_SEGMENT(GS);
 +               PRINT_GUEST_SEGMENT(TR);
 +               PRINT_GUEST_SEGMENT(LDTR);
 +
 +               printk(KERN_DEBUG "GDTR: BASE 0x%lx, LIMIT 0x%lx\n",
 +                      vmcs_readl(GUEST_GDTR_BASE),
 +                      vmcs_readl(GUEST_GDTR_LIMIT));
 +               printk(KERN_DEBUG "IDTR: BASE 0x%lx, LIMIT 0x%lx\n",
 +                      vmcs_readl(GUEST_IDTR_BASE),
 +                      vmcs_readl(GUEST_IDTR_LIMIT));
 +               printk(KERN_DEBUG "RIP: 0x%lx\n", vmcs_readl(GUEST_RIP));
 +               printk(KERN_DEBUG "RFLAGS: 0x%lx\n", vmcs_readl(GUEST_RFLAGS));
         }
         return 0;
Maybe move this to a separate function?  vmx_handle_exit() will be hard
to read with this blob in the middle.
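
One way this suggestion could look is sketched below in user space. This is only an illustration under stated assumptions: demo_fields[], demo_read(), DEMO_FORMAT_SEGMENT() and demo_dump_guest_state() are hypothetical stand-ins for the VMCS fields, vmcs_readl(), the patch's macro and a new helper, and fprintf() replaces printk(). The macro uses #seg to turn the segment name into a string and ##seg## to paste it into the field name.

```c
#include <stdio.h>

/* Hypothetical field identifiers standing in for the GUEST_*_SELECTOR/BASE encodings. */
enum demo_field {
        DEMO_CS_SELECTOR, DEMO_CS_BASE,
        DEMO_SS_SELECTOR, DEMO_SS_BASE,
        DEMO_FIELD_COUNT
};

/* Hypothetical backing store; the real code would read the VMCS instead. */
static unsigned long demo_fields[DEMO_FIELD_COUNT];

static unsigned long demo_read(enum demo_field field)
{
        return demo_fields[field];
}

/* Format one segment line into buf: #seg stringifies, ##seg## pastes tokens. */
#define DEMO_FORMAT_SEGMENT(buf, len, seg)                              \
        snprintf((buf), (len), #seg ": SELECTOR 0x%lx, BASE 0x%lx",     \
                 demo_read(DEMO_##seg##_SELECTOR),                      \
                 demo_read(DEMO_##seg##_BASE))

/* Separate helper, so the exit handler would only contain one call. */
static void demo_dump_guest_state(FILE *out)
{
        char line[96];

        DEMO_FORMAT_SEGMENT(line, sizeof(line), CS);
        fprintf(out, "%s\n", line);
        DEMO_FORMAT_SEGMENT(line, sizeof(line), SS);
        fprintf(out, "%s\n", line);
}
```

The error branch of the exit handler would then shrink to the two assignments plus a single call to the helper.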

--
Gleb.


Re: [PATCH] Fix up vmx_set_segment for booting older guests.

2009-10-20 Thread Avi Kivity

On 10/20/2009 04:50 PM, Chris Lalancette wrote:

If a guest happens to be unlucky enough to use an address
such as 0xc0000000 in the CS base address field, the next attempt
to VM enter will fail.  This is because the vmcs_writel() that
writes the base address into the VMCS will sign-extend the field
to 64-bits, and the Intel manual states that bits 63:32 of this
field *must* be 0.  Use vmcs_write32() where appropriate.
This fixes booting of an absolutely ancient Red Hat Linux 5.2
(not Enterprise Linux!) guest.



diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 364263a..311afd4 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1846,7 +1846,22 @@ static void vmx_set_segment(struct kvm_vcpu *vcpu,
                vmx->rmode.tr.ar = vmx_segment_access_rights(var);
                return;
        }
-       vmcs_writel(sf->base, var->base);
+
+       /* Intel 64 and IA-32 Architecture Software Developer's Manual Vol. 3b,
+        * section 22.3.1.2 states that VMENTRY will fail if bits 63:32 of the
+        * base address for CS, SS, DS, ES are not 0 and the register is usable.
+        *
+        * If var->base happens to have bit 31 set, then it will get sign
+        * extended on the vmcs_writel(), causing this check to fail.  Make
+        * sure to use the 32-bit version where appropriate.
+        */
+       if (sf->base == GUEST_CS_BASE ||
+           ((~sf->ar_bytes & 0x00010000) && (sf->base == GUEST_SS_BASE ||
+                                              sf->base == GUEST_DS_BASE ||
+                                              sf->base == GUEST_ES_BASE)))
+               vmcs_write32(sf->base, var->base);
   


This will leave high bits untouched, so if any were set, this will fail.


+       else
+               vmcs_writel(sf->base, var->base);
        vmcs_write32(sf->limit, var->limit);
   


I think the correct fix is to zero extend in vmcs_writel() rather than 
here.  But as far as I can tell, it already does.  Where does the sign 
extension occur?  Perhaps in userspace?


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH] Print Guest VMCS state on vmexit failure

2009-10-20 Thread Avi Kivity

On 10/20/2009 04:50 PM, Chris Lalancette wrote:

If we fail to handle a VMEXIT for some reason, print out a lot
more debugging information about the state of the GUEST VMCS
area.  This does not fix a bug, but helps a lot when trying to
track down the cause of a VMEXIT/VMENTRY failure.
   


Register state can just as easily be examined in the qemu monitor.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: Do I set up separate bridges for each guest?

2009-10-20 Thread Dor Laor

On 10/20/2009 04:37 AM, Neil Aggarwal wrote:

Hello:

I am installing KVM on top of CentOS 5.4 so I can
have two guests running on my host. I would like to
have the host and guests accessible from my
network.

Do I set up separate bridges for each guest or would
they somehow be shared?

If I set up separate bridges, I think I need to do
in /etc/sysconfig/network-scripts on the host machine:

1. Set up ifcfg-eth0 with the ip information of the
host (For example 192.168.2.200)
2. Set up ifcfg-eth0:1 for the first guest.  It will
have BRIDGE=br1
3. Create ifcfg-br1 with the IP info for the first
guest (For example 192.168.2.201)
4. Set up ifcfg-eth0:2 for the second guest.  It will
have BRIDGE=br2
5. Create ifcfg-br2 with the IP info for the second
guest (For example 192.168.2.202)

Is this correct or did I miss something?


The simplest approach is to use a single bridge for everything.
The physical NIC should be part of it and provide the connection to the
outside world. The physical NIC doesn't need an IP address; the bridge
should own it. All VMs can use this bridge.


# cat /etc/sysconfig/network-scripts/ifcfg-br0
DEVICE=br0
TYPE=Bridge
ONBOOT=yes
GATEWAYDEV=''
BOOTPROTO=dhcp
DELAY=0
HWADDR=00:14:5E:17:D0:04
# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
HWADDR=00:14:5E:17:D0:04
BRIDGE=br0




Thanks,
Neil


--
Neil Aggarwal, (281)846-8957, www.JAMMConsulting.com
Will your e-commerce site go offline if you have
a DB server failure, fiber cut, flood, fire, or other disaster?
If so, ask about our geographically redundant database system.



Re: [PATCH] Print Guest VMCS state on vmexit failure

2009-10-20 Thread Chris Lalancette
Avi Kivity wrote:
 On 10/20/2009 04:50 PM, Chris Lalancette wrote:
 If we fail to handle a VMEXIT for some reason, print out a lot
 more debugging information about the state of the GUEST VMCS
 area.  This does not fix a bug, but helps a lot when trying to
 track down the cause of a VMEXIT/VMENTRY failure.

 
 register state can just as easily be examined in the qemu monitor.
 

Ah, true.  OK, forget this patch.

-- 
Chris Lalancette


Re: [PATCH v2 4/4] KVM: x86: Add VCPU substate for NMI states

2009-10-20 Thread Jan Kiszka
Marcelo Tosatti wrote:
 On Thu, Oct 15, 2009 at 07:05:36PM +0200, Jan Kiszka wrote:
 This plugs an NMI-related hole in the VCPU synchronization between
 kernel and user space. So far, neither pending NMIs nor the inhibit NMI
 mask was properly read/set which was able to cause problems on
 vmsave/restore, live migration and system reset. Fix it by making use
 of the new VCPU substate interface.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---

  Documentation/kvm/api.txt   |   12 
  arch/x86/include/asm/kvm.h  |7 +++
  arch/x86/include/asm/kvm_host.h |2 ++
  arch/x86/kvm/svm.c  |   22 ++
  arch/x86/kvm/vmx.c  |   30 ++
  arch/x86/kvm/x86.c  |   26 ++
  6 files changed, 99 insertions(+), 0 deletions(-)

 diff --git a/Documentation/kvm/api.txt b/Documentation/kvm/api.txt
 index bee5bbd..e483edb 100644
 --- a/Documentation/kvm/api.txt
 +++ b/Documentation/kvm/api.txt
 @@ -848,3 +848,15 @@ Deprecates: KVM_GET/SET_CPUID2
  Architectures: x86
  Payload: struct kvm_lapic
  Deprecates: KVM_GET/SET_LAPIC
 +
 +6.8 KVM_X86_VCPU_STATE_NMI
 +
 +Architectures: x86
 +Payload: struct kvm_nmi_state
 +Deprecates: -
 +
 +struct kvm_nmi_state {
 +   __u8 pending;
 +   __u8 masked;
 +   __u8 pad1[6];
 
 Don't you also have to save nmi_injected, in case of failure during
 NMI delivery.
 

Something made me think it's not required. Don't ask me what, it was
wrong anyway. Will roll out -v3 for this patch.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


Re: [PATCH] Print Guest VMCS state on vmexit failure

2009-10-20 Thread Nikola Ciprich
Hi,
maybe it's a stupid question, but is this also available when qemu/kvm
is started through libvirt and similar tools? I think libvirt uses the
monitor itself, so it's inaccessible to the user, no?
n.



On Tue, Oct 20, 2009 at 10:42:24AM +0200, Chris Lalancette wrote:
 Avi Kivity wrote:
  On 10/20/2009 04:50 PM, Chris Lalancette wrote:
  If we fail to handle a VMEXIT for some reason, print out a lot
  more debugging information about the state of the GUEST VMCS
  area.  This does not fix a bug, but helps a lot when trying to
  track down the cause of a VMEXIT/VMENTRY failure.
 
  
  register state can just as easily be examined in the qemu monitor.
  
 
 Ah, true.  OK, forget this patch.
 
 -- 
 Chris Lalancette
 

-- 
-
Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:+420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-


Re: [PATCH v2 4/4] KVM: x86: Add VCPU substate for NMI states

2009-10-20 Thread Jan Kiszka
Avi Kivity wrote:
 On 10/20/2009 05:39 AM, Gleb Natapov wrote:
 BTW, what happens to exceptions that fail to be delivered? Can't see
 where they are saved/restored across migration.

  
 The instruction that caused an exception will be re-executed after
 migration and exception will be regenerated.
 
 Except for debug exceptions (traps).
 
 But I think we should
 migrate exception anyway for completeness.

 
 Yes.

So save/restore kvm_vcpu_arch::exception? As another substate or as part
of a generalized NMI substate?

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


Re: [PATCH] Print Guest VMCS state on vmexit failure

2009-10-20 Thread Chris Lalancette
Nikola Ciprich wrote:
 Hi,
 maybe it's a stupid question, but is this also available when qemu/kvm
 is started through libvirt and similar tools? I think libvirt uses the
 monitor itself, so it's inaccessible to the user, no?

Yes and no.  The monitor is inaccessible when using libvirt, but I totally
forgot that qemu dumps the register state to stderr before abort()'ing on an
unknown vm exit.  Libvirt takes the output from stderr and stores it in
/var/log/libvirt/qemu/guestname.  So you would still be able to see this
output when using libvirt.

-- 
Chris Lalancette


0.11: SMP guests using one host CPU only?

2009-10-20 Thread Tomasz Chmielewski

On an 8-CPU host, I created a guest with 4 CPUs (-smp 4).

Unfortunately, the guest only uses one host CPU.
For example, running "cat /dev/urandom | gzip -9 > /dev/null &" several
times on this guest causes load on only one host CPU.


Is this expected?

The host is running 2.6.32-rc5 and qemu-kvm-0.11. I also tried 2.6.31.5 
with qemu-kvm-0.11 with the same result.



I have another machine, running a 2.6.24 kernel, where it works just fine
(running several CPU-intensive tasks on a guest results in several host
CPUs being loaded).




--
Tomasz Chmielewski
http://wpkg.org


Re: [PATCH v2 4/4] KVM: x86: Add VCPU substate for NMI states

2009-10-20 Thread Avi Kivity

On 10/20/2009 05:56 PM, Jan Kiszka wrote:

So save/restore kvm_vcpu_arch::exception? As another substate or as part
of a generalized NMI substate?
   


Yes.  It's not part of an nmi substate, but both can be part of an 
exception substate (but need to look at the docs vewy cawefuwy to make 
sure we don't screw up again).


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH v2 4/4] KVM: x86: Add VCPU substate for NMI states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 06:06:36PM +0900, Avi Kivity wrote:
 On 10/20/2009 05:56 PM, Jan Kiszka wrote:
 So save/restore kvm_vcpu_arch::exception? As another substate or as part
 of a generalized NMI substate?
 
 Yes.  It's not part of an nmi substate, but both can be part of an
 exception substate (but need to look at the docs vewy cawefuwy to
 make sure we don't screw up again).
 
What do you mean? How can they both be part of an exception substate?

--
Gleb.


Re: [PATCH] Print Guest VMCS state on vmexit failure

2009-10-20 Thread Avi Kivity

On 10/20/2009 05:57 PM, Chris Lalancette wrote:

Nikola Ciprich wrote:
   

Hi,
maybe it's a stupid question, but is this also available when qemu/kvm
is started through libvirt and similar tools? I think libvirt uses the
monitor itself, so it's inaccessible to the user, no?
 

Yes and no.  The monitor is inaccessible when using libvirt, but I totally
forgot that qemu dumps the register state to stderr before abort()'ing on an
unknown vm exit.  Libvirt takes the output from stderr and stores it in
/var/log/libvirt/qemu/guestname.  So you would still be able to see this
output when using libvirt.
   


We've dropped the stderr part (IIRC), but nothing prevents libvirt from 
accessing the register state and providing it to the user.


There's also the multiple monitor support which can be used for 
debugging.  Finally, you can connect with gdb (need to dynamically start 
the gdb server via the monitor, again needs libvirt support).


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH v2 4/4] KVM: x86: Add VCPU substate for NMI states

2009-10-20 Thread Avi Kivity

On 10/20/2009 06:08 PM, Gleb Natapov wrote:

On Tue, Oct 20, 2009 at 06:06:36PM +0900, Avi Kivity wrote:
   

On 10/20/2009 05:56 PM, Jan Kiszka wrote:
 

So save/restore kvm_vcpu_arch::exception? As another substate or as part
of a generalized NMI substate?
   

Yes.  It's not part of an nmi substate, but both can be part of an
exception substate (but need to look at the docs vewy cawefuwy to
make sure we don't screw up again).

 

What do you mean? How can they both be part of an exception substate?

   


Sorry, nomenclature failure.  We need NMI state, Interrupt state 
(already provided), and pending exception state (which can be a fault or 
a trap).  There's also some extra state associated with pending debug 
exceptions (maybe we can copy it into dr6).


We can either put all of these into one substate, or into separate 
substates.  I'm not sure which is best.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: 0.11: SMP guests using one host CPU only?

2009-10-20 Thread Avi Kivity

On 10/20/2009 06:03 PM, Tomasz Chmielewski wrote:

On an 8-CPU host, I created a guest with 4 CPUs (-smp 4).

Unfortunately, the guest only uses one host CPU.
For example, running "cat /dev/urandom | gzip -9 > /dev/null &" several
times on this guest causes load on only one host CPU.


Is this expected?


No.  What does 'top -H' show?

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH] Fix up vmx_set_segment for booting older guests.

2009-10-20 Thread Chris Lalancette
Avi Kivity wrote:
 +       else
 +               vmcs_writel(sf->base, var->base);
         vmcs_write32(sf->limit, var->limit);

 
 I think the correct fix is to zero extend in vmcs_writel() rather than 
 here.  But as far as I can tell, it already does.  Where does the sign 
 extension occur?  Perhaps in userspace?

Very good question Avi.  I should have dug a bit deeper before posting.  I
traced this further back, and here's what it looks like is going on:

arch/x86/kvm/x86.c:kvm_load_segment_descriptor() is responsible for loading the
CPU segment descriptor into the VMCS area.  It does this by calling
load_segment_descriptor_to_kvm_desct(), doing a few minor transformations of the
data, then calling kvm_set_segment() to load it into the VMCS.

The problem arises in load_segment_descriptor_to_kvm_desct() ->
seg_desct_to_kvm_desct().  seg_desct_to_kvm_desct() takes the struct desc_struct
(in this case, base0 == 0x0, base1 == 0x0, and base2 == 0xc0), then calls
get_desc_base() and stores the result in the struct kvm_segment.  It's in the
return value from get_desc_base() that the sign-extension occurs, which
eventually causes the VM entry failure.

get_desc_base() sign-extends because of some complicated u8 to unsigned rules
that I'm not completely sure of.  The below patch fixes my original issue, but
I'm not at all sure that this is the right thing to do.  I could also change
get_desc_base() itself to do the casting, which should do the right thing for
all callers, but I'm not sure if that's what all callers want.  Anybody else
have an opinion?
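
The effect described above can be reproduced in user space. The sketch below is only a model, not the kernel's get_desc_base(): struct demo_desc and both helper functions are hypothetical, and it assumes the 32-bit base is treated as a signed int before being widened to 64 bits, which is what integer promotion of the narrow base2 field would lead to.

```c
#include <stdint.h>

/* Hypothetical stand-in for the three base fields of a segment descriptor. */
struct demo_desc {
        uint16_t base0;
        uint8_t  base1;
        uint8_t  base2;
};

/* Assemble the 32-bit base with unsigned arithmetic only. */
static uint32_t demo_base_bits(const struct demo_desc *d)
{
        return (uint32_t)d->base0 |
               ((uint32_t)d->base1 << 16) |
               ((uint32_t)d->base2 << 24);
}

/*
 * Models the problematic path: the value passes through a signed 32-bit
 * int, so bit 31 is copied into bits 63:32 when it is widened.
 * (The uint32_t -> int32_t conversion is implementation-defined; common
 * compilers keep the two's complement bit pattern.)
 */
static uint64_t demo_base_sign_extended(const struct demo_desc *d)
{
        int32_t as_int = (int32_t)demo_base_bits(d);

        return (uint64_t)(int64_t)as_int;
}

/* Models the cast in the patch below: widen from an unsigned 32-bit value. */
static uint64_t demo_base_zero_extended(const struct demo_desc *d)
{
        return (uint64_t)demo_base_bits(d);
}
```

For base0 == 0x0, base1 == 0x0 and base2 == 0xc0, the first helper yields a value with bits 63:32 set, while the second keeps them clear.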


diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a93ba29..b58bda2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3997,7 +3997,7 @@ static void kvm_set_segment(struct kvm_vcpu *vcpu,
 static void seg_desct_to_kvm_desct(struct desc_struct *seg_desc, u16 selector,
                                   struct kvm_segment *kvm_desct)
 {
-       kvm_desct->base = get_desc_base(seg_desc);
+       kvm_desct->base = (unsigned)get_desc_base(seg_desc);
        kvm_desct->limit = get_desc_limit(seg_desc);
        if (seg_desc->g) {
                kvm_desct->limit <<= 12;


-- 
Chris Lalancette


Re: 0.11: SMP guests using one host CPU only?

2009-10-20 Thread Tomasz Chmielewski

Avi Kivity wrote:

On 10/20/2009 06:03 PM, Tomasz Chmielewski wrote:

On an 8-CPU host, I created a guest with 4 CPUs (-smp 4).

Unfortunately, the guest only uses one host CPU.
For example, running "cat /dev/urandom | gzip -9 > /dev/null &" several
times on this guest causes load on only one host CPU.


Is this expected?


No.  What does 'top -H' show?


In the guest - 4 CPUs with ~100% usage each (when I press 1), otherwise, in the task 
list, multiple cat processes taking most CPU time (as it reads from /dev/urandom).


In the host - qemu-system-x86 (one process/thread) taking ~100% CPU; when I press 
1, I see only one CPU is used 100%, 7 other CPUs are more or less not used.


guest command line:


/usr/bin/qemu-system-x86_64 -m 1024 -drive 
file=/srv/kvm/images/lvs2,if=virtio,cache=writeback,index=0,boot=on -net 
nic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3F -net 
tap,vlan=0,script=/etc/qemu-ifup -localtime -smp 4


There are 5 other guests (1 CPU) started before this guest.


--
Tomasz Chmielewski
http://wpkg.org


Re: vhost-net patches

2009-10-20 Thread Michael S. Tsirkin
On Mon, Oct 19, 2009 at 03:56:54PM -0700, Sridhar Samudrala wrote:
 On Sun, 2009-10-18 at 19:32 +0200, Michael S. Tsirkin wrote:
  On Sun, Oct 18, 2009 at 12:53:56PM +0200, Michael S. Tsirkin wrote:
   On Fri, Oct 16, 2009 at 12:29:29PM -0700, Sridhar Samudrala wrote:
Hi Michael,

We are trying out your vhost-net patches from your git trees on 
kernel.org.
I am using mst/vhost.git as host kernel and mst/qemu-kvm.git for qemu.

I am using the following qemu script to start the guest using userspace 
tap backend.

home/sridhar/git/mst/qemu-kvm/x86_64-softmmu/qemu-system-x86_64 
/home/sridhar/kvm_images/fedora10-1-vm -m 512 -drive 
file=/home/sridhar/kvm_images/fedora10-1-vm,if=virtio,index=0,boot=on 
-net nic,macaddr=54:52:00:35:e3:73,model=virtio -net 
tap,ifname=vnet0,script=no,downscript=no

Now that i got the default backend to work, i wanted to try vhost in 
kernel. But
could not figure out the right -net option to pass to qemu.

Can you let me know the right syntax to start a guest using vhost.

Thanks
Sridhar
   
   Here's an example with raw socket:
   
   /root/kvm-test/bin/qemu-system-x86_64 -m 1G -kernel \
   /boot/vmlinuz-$release -append \
   'root=UUID=d5d2d201-d086-42ad-bb1d-32fbe40eda71 ro quiet nosplash \
   console=tty0 console=ttyS0,9600n8' -initrd /boot/guest-initrd.img \
   $HOME/disk.raw.copy -net raw,ifname=eth3 -net nic,model=virtio,vhost \
   -balloon none -redir tcp:8023::22
   
   As you see, I changed the command line.
   You now simply add ,vhost after model, and it will locate
   host network interface specified earlier and attach to it.
   This should have been clear from running  qemu with -help
   flag. Could you please suggest how can that text
   be clarified?
 
 I updated to your latest git trees and the default user-space tap backend 
 using the
 following -net options worked fine.
 -net tap,ifname=vnet0,script=no,downscript=no -net nic,model=virtio
 
 But i could not get vhost to work with either raw or tap backends.
 I tried the following combinations.
 1) -net raw,ifname=eth0 -net nic,model=virtio,vhost
 2) -net raw,ifname=vnet0, -net nic,model=virtio,vhost
 3) -net tap,ifname=vnet0,script=no,downscript=no -net nic,model=virtio,vhost
 
 They all failed with the following error
 vhost_net_init returned -7
 This is an error message from hw/virtio-net.c:virtio_net_driver_ok() when
 vhost_net_start() fails. It looks like dev->binding->irqfd() is failing in
 vhost_virtqueue_init(). Haven't yet debugged further. I have CONFIG_EVENTFD
 enabled in the host kernel.
 
 Are all the above -net options supposed to work?
 
 In your descriptions, you say that checksum/tso offload is not supported.

They should work with tap but not raw sockets yet.

 Isn't it
 possible to send/receive large packets without checksum using AF_PACKET 
 sockets if
 the attached interface supports these offloads.
 Do you see the same offload issue even when using tap backend via vhost?
 
 Thanks
 Sridhar
 
 
 
 
 
 


Re: vhost-net patches

2009-10-20 Thread Michael S. Tsirkin
On Mon, Oct 19, 2009 at 04:08:24PM -0700, Shirley Ma wrote:
 Hello Michael,
 
 They all failed with the following error
 vhost_net_init returned -7
 This is an error message from hw/virtio-net.c:virtio_net_driver_ok() when
 vhost_net_start() fails. It looks like dev->binding->irqfd() is failing in
 vhost_virtqueue_init(). Haven't yet debugged further. I have CONFIG_EVENTFD
 enabled in the host kernel.
 
 From the debug output, looks like the vnet->vector is not defined,

what is vnet->vector?
And what do you mean by not defined?

 and the
 default msix_entries_nr is 3, so it returned EINVAL from virtio_pci_irqfd.
 Looks we need to either disable QEMU_PCI_CAP_MSIX or define vector in QEMU
 configuration?

You shouldn't have to do anything.

 I am not familiar with MSIX stuffs.
 
 Thanks
 Shirley
 
 
 
 On Sun, 2009-10-18 at 19:32 +0200, Michael S. Tsirkin wrote:
  On Sun, Oct 18, 2009 at 12:53:56PM +0200, Michael S. Tsirkin wrote:
   On Fri, Oct 16, 2009 at 12:29:29PM -0700, Sridhar Samudrala wrote:
Hi Michael,
   
We are trying out your vhost-net patches from your git trees on
 kernel.org.
I am using mst/vhost.git as host kernel and mst/qemu-kvm.git for qemu.
   
I am using the following qemu script to start the guest using userspace
 tap backend.
   
home/sridhar/git/mst/qemu-kvm/x86_64-softmmu/qemu-system-x86_64 /home/
 sridhar/kvm_images/fedora10-1-vm -m 512 -drive file=/home/sridhar/kvm_images/
 fedora10-1-vm,if=virtio,index=0,boot=on -net nic,macaddr=
 54:52:00:35:e3:73,model=virtio -net tap,ifname=vnet0,script=no,downscript=no
   
Now that i got the default backend to work, i wanted to try vhost in
 kernel. But
could not figure out the right -net option to pass to qemu.
   
Can you let me know the right syntax to start a guest using vhost.
   
Thanks
Sridhar
  
   Here's an example with raw socket:
  
   /root/kvm-test/bin/qemu-system-x86_64 -m 1G -kernel \
   /boot/vmlinuz-$release -append \
   'root=UUID=d5d2d201-d086-42ad-bb1d-32fbe40eda71 ro quiet nosplash \
   console=tty0 console=ttyS0,9600n8' -initrd /boot/guest-initrd.img \
   $HOME/disk.raw.copy -net raw,ifname=eth3 -net nic,model=virtio,vhost \
   -balloon none -redir tcp:8023::22
  
   As you see, I changed the command line.
   You now simply add ,vhost after model, and it will locate
   host network interface specified earlier and attach to it.
   This should have been clear from running  qemu with -help
   flag. Could you please suggest how can that text
   be clarified?
 
 I updated to your latest git trees and the default user-space tap backend 
 using
 the
 following -net options worked fine.
 -net tap,ifname=vnet0,script=no,downscript=no -net nic,model=virtio
 
 But i could not get vhost to work with either raw or tap backends.
 I tried the following combinations.
 1) -net raw,ifname=eth0 -net nic,model=virtio,vhost
 2) -net raw,ifname=vnet0, -net nic,model=virtio,vhost
 3) -net tap,ifname=vnet0,script=no,downscript=no -net nic,model=virtio,vhost

Yes, should work.

 
 They all failed with the following error
vhost_net_init returned -7
 This is an error message from hw/virtio-net.c:virtio_net_driver_ok() when
 vhost_net_start() fails. It looks like dev->binding->irqfd() is failing in
 vhost_virtqueue_init().

what parameters are passed in?

 Haven't yet debugged further.

this calls into virtio_pci_irqfd.

 I have CONFIG_EVENTFD
 enabled in the host kernel.

Note you need to also enable eventfd support under kvm menu.

 Are all the above -net options supposed to work?
 
 In your descriptions, you say that checksum/tso offload is not supported. 
 Isn't
 it
 possible to send/receive large packets without checksum using AF_PACKET 
 sockets
 if
 the attached interface supports these offloads.
 Do you see the same offload issue even when using tap backend via vhost?
 
 Thanks
 Sridhar
 
 
 
 
 
 
 
 



[PATCH 1/2] kvm-kmod: Use the main development tree of kvm as Linux submodule

2009-10-20 Thread wolfgang . mauerer
From: Wolfgang Mauerer wolfgang.maue...@siemens.com

Most people won't have the sources installed in the path
that is the current default setting.

Signed-off-by: Wolfgang Mauerer wolfgang.maue...@siemens.com
---
 .gitmodules |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/.gitmodules b/.gitmodules
index 9c63921..42fc7a1 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -1,3 +1,3 @@
 [submodule "linux-2.6"]
        path = linux-2.6
-       url = ../kvm.git
+       url = git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git
-- 
1.6.4



[PATCH 2/2] kvm-kmod: Document the build process

2009-10-20 Thread wolfgang . mauerer
From: Wolfgang Mauerer wolfgang.maue...@siemens.com

A package without build instructions is like a kernel
without a penguin.

Signed-off-by: Wolfgang Mauerer wolfgang.maue...@siemens.com
---
 README |   26 ++
 1 files changed, 26 insertions(+), 0 deletions(-)
 create mode 100644 README

diff --git a/README b/README
new file mode 100644
index 0000000..40a72d3
--- /dev/null
+++ b/README
@@ -0,0 +1,26 @@
+Building the KVM kernel module is performed differently depending on whether
+you are working from a clone of the git repository or from a source release.
+
+- To build from a release, simply use ./configure (possibly with any
+  arguments that are required for your setup, see ./configure --help)
+  and make.
+
+- Building from a cloned git repository requires a kernel tree with the main
+  kvm sources that is included as a submodule in the linux-2.6/ directory.  By
+  default, the KVM development tree on git.kernel.org is used, but you can
+  change this setting in .gitmodules
+
+  Before the kvm module can be built, the linux submodule must be initialised 
+  and populated. The required sequence of commands is
+
+  git submodule init
+  git submodule update
+  ./configure
+  make sync
+  make
+
+  Notice that you can also specify an existing Linux tree for the
+  synchronisation stage by using
+
+  make sync LINUX=/path/to/tree
+
-- 
1.6.4



KVM: VMX: remove GUEST_CR3 write from vmx_vcpu_run

2009-10-20 Thread Marcelo Tosatti

GUEST_CR3 is updated via kvm_set_cr3 whenever CR3 value
changes.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 364263a..325075f 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3638,10 +3638,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
 {
        struct vcpu_vmx *vmx = to_vmx(vcpu);
 
-       if (enable_ept && is_paging(vcpu)) {
-               vmcs_writel(GUEST_CR3, vcpu->arch.cr3);
+       if (enable_ept && is_paging(vcpu))
                ept_load_pdptrs(vcpu);
-       }
+
        /* Record the guest's net vcpu time for enforced NMI injections. */
        if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
                vmx->entry_time = ktime_get();


Re: [PATCH v2 4/4] KVM: x86: Add VCPU substate for NMI states

2009-10-20 Thread Marcelo Tosatti
On Tue, Oct 20, 2009 at 06:14:04PM +0900, Avi Kivity wrote:
> On 10/20/2009 06:08 PM, Gleb Natapov wrote:
>> On Tue, Oct 20, 2009 at 06:06:36PM +0900, Avi Kivity wrote:
>>> On 10/20/2009 05:56 PM, Jan Kiszka wrote:
>>>> So save/restore kvm_vcpu_arch::exception? As another substate or as part
>>>> of a generalized NMI substate?
>>>
>>> Yes.  It's not part of an nmi substate, but both can be part of an
>>> exception substate (but need to look at the docs vewy cawefuwy to
>>> make sure we don't screw up again).
>>
>> What do you mean? How they can be both part of exception substate?
>
> Sorry, nomenclature failure.  We need NMI state, Interrupt state
> (already provided), and pending exception state (which can be a fault or
> a trap).  There's also some extra state associated with pending debug
> exceptions (maybe we can copy it into dr6).

KVM_REQ_TRIPLE_FAULT can also be lost, but I don't think anybody cares?

> We can either put all of these into one substate, or into separate
> substates.  I'm not sure which is best.


Re: [PATCH v2 4/4] KVM: x86: Add VCPU substate for NMI states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 09:13:02AM -0200, Marcelo Tosatti wrote:
> On Tue, Oct 20, 2009 at 06:14:04PM +0900, Avi Kivity wrote:
>> On 10/20/2009 06:08 PM, Gleb Natapov wrote:
>>> On Tue, Oct 20, 2009 at 06:06:36PM +0900, Avi Kivity wrote:
>>>> On 10/20/2009 05:56 PM, Jan Kiszka wrote:
>>>>> So save/restore kvm_vcpu_arch::exception? As another substate or as part
>>>>> of a generalized NMI substate?
>>>>
>>>> Yes.  It's not part of an nmi substate, but both can be part of an
>>>> exception substate (but need to look at the docs vewy cawefuwy to
>>>> make sure we don't screw up again).
>>>
>>> What do you mean? How they can be both part of exception substate?
>>
>> Sorry, nomenclature failure.  We need NMI state, Interrupt state
>> (already provided), and pending exception state (which can be a fault or
>> a trap).  There's also some extra state associated with pending debug
>> exceptions (maybe we can copy it into dr6).
>
> KVM_REQ_TRIPLE_FAULT can also be lost, but i don't think anybody cares?

If the pending exception is migrated, KVM_REQ_TRIPLE_FAULT will be restored
when the guest tries to re-execute the instruction that caused it. One more
reason to migrate pending exceptions. And why not migrate
KVM_REQ_TRIPLE_FAULT while we are at it?

>> We can either put all of these into one substate, or into separate
>> substates.  I'm not sure which is best.

--
Gleb.


List of unaccessible x86 states

2009-10-20 Thread Jan Kiszka
Hi all,

as the list of yet user-unaccessible x86 states is a bit volatile ATM,
this is an attempt to collect the precise requirements for additional
state fields. Once everyone feels the list is complete, we can decide
how to partition it into one or more substates for the new
KVM_GET/SET_VCPU_STATE interface.

What I read so far (or tried to patch already):

- nmi_masked
- nmi_pending
- nmi_injected
- kvm_queued_exception (whole struct content)
- KVM_REQ_TRIPLE_FAULT (from vcpu.requests)

Unclear points (for me) from the last discussion:

- sipi_vector
- MCE (covered via kvm_queued_exception, or does it require more?)

Please extend or correct the list as required.
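
To make the candidate list easier to discuss, it can be written down as a plain data structure. This is only a sketch in Python: the class names, the field types and the inner layout of the queued exception are assumptions for illustration, not the layout of any existing or proposed KVM_GET/SET_VCPU_STATE substate.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class QueuedException:
    # Placeholder fields; the list above only says "whole struct content".
    pending: bool = False
    nr: int = 0
    has_error_code: bool = False
    error_code: int = 0


@dataclass
class CandidateVcpuState:
    # Items from the "read so far" part of the list.
    nmi_masked: bool = False
    nmi_pending: bool = False
    nmi_injected: bool = False
    exception: QueuedException = field(default_factory=QueuedException)
    triple_fault_requested: bool = False  # stands in for KVM_REQ_TRIPLE_FAULT
    # Item from the "unclear points" part; None means "not decided yet".
    sipi_vector: Optional[int] = None


if __name__ == "__main__":
    state = CandidateVcpuState(nmi_pending=True)
    print(state)
```

Whether these end up in one substate or several is exactly the open partitioning question.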

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


Re: List of unaccessible x86 states

2009-10-20 Thread Alexander Graf


On 20.10.2009, at 15:01, Jan Kiszka wrote:


> Hi all,
>
> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
> this is an attempt to collect the precise requirements for additional
> state fields. Once everyone feels the list is complete, we can decide
> how to partition it into one or more substates for the new
> KVM_GET/SET_VCPU_STATE interface.
>
> What I read so far (or tried to patch already):
>
> - nmi_masked
> - nmi_pending
> - nmi_injected
> - kvm_queued_exception (whole struct content)
> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
>
> Unclear points (for me) from the last discussion:
>
> - sipi_vector
> - MCE (covered via kvm_queued_exception, or does it require more?)
>
> Please extend or correct the list as required.

hflags. Qemu supports GIF, kvm supports GIF, but neither side knows how to
sync it.


Alex


Re: KVM: VMX: remove GUEST_CR3 write from vmx_vcpu_run

2009-10-20 Thread Avi Kivity

On 10/20/2009 09:37 PM, Marcelo Tosatti wrote:

> GUEST_CR3 is updated via kvm_set_cr3 whenever CR3 value
> changes.
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 364263a..325075f 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -3638,10 +3638,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  {
>         struct vcpu_vmx *vmx = to_vmx(vcpu);
>
> -       if (enable_ept && is_paging(vcpu)) {
> -               vmcs_writel(GUEST_CR3, vcpu->arch.cr3);
> +       if (enable_ept && is_paging(vcpu))
>                 ept_load_pdptrs(vcpu);
> -       }
> +
>         /* Record the guest's net vcpu time for enforced NMI injections. */
>         if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
>                 vmx->entry_time = ktime_get();
   


Nice.  Any reason why ept_load_pdptrs() couldn't go the same way?

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: 0.11: SMP guests using one host CPU only?

2009-10-20 Thread Avi Kivity

On 10/20/2009 07:17 PM, Tomasz Chmielewski wrote:

> Avi Kivity wrote:
>> On 10/20/2009 06:03 PM, Tomasz Chmielewski wrote:
>>> On a 8 CPU host, I created a guest with 4 CPUs (-smp 4).
>>>
>>> Unfortunately, the guest only uses one host CPU.
>>> For example, running "cat /dev/urandom | gzip -9 > /dev/null &"
>>> several times on this guest causes load on only one host CPU.
>>>
>>> Is it expected?
>>
>> No.  What does 'top -H' show?
>
> In the guest - 4 CPUs with ~100% usage each (when I press 1),
> otherwise, in the task list, multiple cat processes taking most CPU
> time (as it reads from /dev/urandom).
>
> In the host - qemu-system-x86 (one process/thread) taking ~100% CPU;
> when I press 1, I see only one CPU is used 100%, 7 other CPUs are
> more or less not used.

I meant, how many qemu threads are there, and how much cpu does each take?

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [Autotest] [PATCH] Test 802.1Q vlan of nic

2009-10-20 Thread Michael Goldish
See comments below.

- Dor Laor dl...@redhat.com wrote:

 On 10/15/2009 11:48 AM, Amos Kong wrote:
 
  Test 802.1Q vlan of nic, config it by vconfig command.
 1) Create two VMs
 2) Setup guests in different vlan by vconfig and test
 communication by ping
using hard-coded ip address
 3) Setup guests in same vlan and test communication by ping
 4) Recover the vlan config
 
  Signed-off-by: Amos Kongak...@redhat.com
  ---
client/tests/kvm/kvm_tests.cfg.sample |6 +++
client/tests/kvm/tests/vlan_tag.py|   73
 +
2 files changed, 79 insertions(+), 0 deletions(-)
mode change 100644 =  100755 client/tests/kvm/scripts/qemu-ifup
 
 In general the above should come as an independent patch.
 
create mode 100644 client/tests/kvm/tests/vlan_tag.py
 
  diff --git a/client/tests/kvm/kvm_tests.cfg.sample
 b/client/tests/kvm/kvm_tests.cfg.sample
  index 9ccc9b5..4e47767 100644
  --- a/client/tests/kvm/kvm_tests.cfg.sample
  +++ b/client/tests/kvm/kvm_tests.cfg.sample
  @@ -166,6 +166,12 @@ variants:
used_cpus = 5
used_mem = 2560
 
  +- vlan_tag:  install setup
  +type = vlan_tag
  +subnet2 = 192.168.123
  +vlans = 10 20
 
 If we want to be fanatic and safe we should dynamically choose subnet
 and vlans numbers that are not used on the host instead of hard code
 it.

For the sake of safety maybe we should start both VMs with -snapshot.
Dor, what do you think?  Is it safe to start 2 VMs with the same disk image
when only one of them uses -snapshot?

  +nic_mode = tap
  +nic_model = e1000
 
 Why only e1000? Let's test virtio and rtl8139 as well. Can't you
 inherit the nic model from the config?

It's not just inherited, it's overwritten, because nic_model is defined
later in the file in a variants block.  So this nic_model line has no
effect.

 
- autoit:   install setup
type = autoit
  diff --git a/client/tests/kvm/scripts/qemu-ifup
 b/client/tests/kvm/scripts/qemu-ifup
  old mode 100644
  new mode 100755
  diff --git a/client/tests/kvm/tests/vlan_tag.py
 b/client/tests/kvm/tests/vlan_tag.py
  new file mode 100644
  index 000..15e763f
  --- /dev/null
  +++ b/client/tests/kvm/tests/vlan_tag.py
  @@ -0,0 +1,73 @@
  +import logging, time
  +from autotest_lib.client.common_lib import error
  +import kvm_subprocess, kvm_test_utils, kvm_utils
  +
  +def run_vlan_tag(test, params, env):
  +
  +Test 802.1Q vlan of nic, config it by vconfig command.
  +
  +1) Create two VMs
  +2) Setup guests in different vlan by vconfig and test
 communication by ping
  +   using hard-coded ip address
  +3) Setup guests in same vlan and test communication by ping
  +4) Recover the vlan config
  +
  +@param test: Kvm test object
  +@param params: Dictionary with the test parameters.
  +@param env: Dictionary with test environment.
  +
  +
  +vm = []
  +session = []
  +subnet2 = params.get(subnet2)
  +vlans = params.get(vlans).split()
  +
  +vm.append(kvm_test_utils.get_living_vm(env, %s % 
  params.get(main_vm)))

There's no need for the "%s" here.
...get_living_vm(env, params.get("main_vm"))) should work.

  +params_vm2 = params.copy()
  +params_vm2['image_snapshot'] = yes
  +params_vm2['kill_vm_gracefully'] = no
  +params_vm2[address_index] = int(params.get(address_index, 0))+1
  +vm.append(vm[0].clone(vm2, params_vm2))
  +kvm_utils.env_register_vm(env, vm2, vm[1])
  +if not vm[1].create():
  +raise error.TestError(VM 1 create faild)
 
 
 The whole 7-8 lines above should be grouped as a function to clone 
 existing VM. It should be part of kvm autotest infrastructure.
 Besides that, it looks good.

There's already a clone function and it's being used here.

Instead of those 7-8 lines, why not just define the VM in the config file?
It looks like you're always using 2 VMs so there's no reason to do this in
test code.  This should do what you want:

    - vlan_tag:  install setup
        type = vlan_tag
        subnet2 = 192.168.123
        vlans = 10 20
        nic_mode = tap
        vms += " vm2"
        extra_params_vm2 += " -snapshot"
        kill_vm_gracefully_vm2 = no
        address_index_vm2 = 1

The preprocessor then automatically creates vm2 and registers it in env.
To use it in the test just do:

    vm.append(kvm_test_utils.get_living_vm(env, "vm2"))

You can also use a parameter that tells the test which VM to use if you don't
want the name "vm2" hardcoded into the test.
Add something like this to the config file:

    2nd_vm = vm2

and in the test use params.get("2nd_vm") instead of "vm2" (just like you use
main_vm).
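
The idea behind the _vm2 suffixes can be shown with a small self-contained sketch. It assumes that a key ending in _<vm name> overrides the generic key for that VM only; params_for_vm is a hypothetical helper written for this illustration and is not the autotest API, and the dictionary just mirrors the config fragment above.

```python
def params_for_vm(params, vm_name):
    """Return a copy of params in which '<key>_<vm_name>' overrides '<key>'."""
    suffix = "_" + vm_name
    resolved = dict(params)
    for key, value in params.items():
        if key.endswith(suffix):
            # Strip the suffix so the override replaces the generic key.
            resolved[key[:-len(suffix)]] = value
    return resolved


# Hypothetical parameters, modelled on the config fragment above.
params = {
    "main_vm": "vm1",
    "2nd_vm": "vm2",
    "vms": "vm1 vm2",
    "address_index": "0",
    "address_index_vm2": "1",
    "kill_vm_gracefully": "yes",
    "kill_vm_gracefully_vm2": "no",
}

second_vm = params.get("2nd_vm")
vm2_params = params_for_vm(params, second_vm)

if __name__ == "__main__":
    print(second_vm, vm2_params["address_index"], vm2_params["kill_vm_gracefully"])
```

With this layout the test never needs to know the second VM's name; it only reads 2nd_vm.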

  +
  +for i in range(2):
  +session.append(kvm_test_utils.wait_for_login(vm[i]))
  +
  +try:
  +vconfig_cmd = vconfig add eth0 %s;ifconfig eth0.%s %s.%s
  +# Attempt to configure IPs for the VMs and record the
 results in
  +# boolean variables
  +# 

Re: List of unaccessible x86 states

2009-10-20 Thread Jan Kiszka
Alexander Graf wrote:
 On 20.10.2009, at 15:01, Jan Kiszka wrote:
 
 Hi all,

 as the list of yet user-unaccessible x86 states is a bit volatile ATM,
 this is an attempt to collect the precise requirements for additional
 state fields. Once everyone feels the list is complete, we can decide
 how to partition it into one ore more substates for the new
 KVM_GET/SET_VCPU_STATE interface.

 What I read so far (or tried to patch already):

 - nmi_masked
 - nmi_pending
 - nmi_injected
 - kvm_queued_exception (whole struct content)
 - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)

 Unclear points (for me) from the last discussion:

 - sipi_vector
 - MCE (covered via kvm_queued_exception, or does it require more?)

 Please extend or correct the list as required.
 
 hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to  
 sync it.

OK. Whole hflags or just the GIF bit?

If we allow access to all bits, can user space cause any problems
(beyond screwing up its guests) by passing weird patterns?

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


Re: 0.11: SMP guests using one host CPU only?

2009-10-20 Thread Tomasz Chmielewski

Avi Kivity wrote:
> On 10/20/2009 07:17 PM, Tomasz Chmielewski wrote:
>> Avi Kivity wrote:
>>> On 10/20/2009 06:03 PM, Tomasz Chmielewski wrote:
>>>> On a 8 CPU host, I created a guest with 4 CPUs (-smp 4).
>>>>
>>>> Unfortunately, the guest only uses one host CPU.
>>>> For example, running "cat /dev/urandom | gzip -9 > /dev/null &"
>>>> several times on this guest causes load on only one host CPU.
>>>>
>>>> Is it expected?
>>>
>>> No.  What does 'top -H' show?
>>
>> In the guest - 4 CPUs with ~100% usage each (when I press 1),
>> otherwise, in the task list, multiple cat processes taking most CPU
>> time (as it reads from /dev/urandom).
>>
>> In the host - qemu-system-x86 (one process/thread) taking ~100% CPU;
>> when I press 1, I see only one CPU is used 100%, 7 other CPUs are
>> more or less not used.
>
> I meant, how many qemu threads are there, and how much cpu does each take?

There is only one qemu thread for the 4-cpu guest.


--
Tomasz Chmielewski
http://wpkg.org



Re: [PATCH] Fix up vmx_set_segment for booting older guests.

2009-10-20 Thread Avi Kivity

On 10/20/2009 07:02 PM, Chris Lalancette wrote:

> get_desc_base() sign-extends because of some complicated u8 to unsigned rules
> that I'm not completely sure of.  The below patch fixes my original issue, but
> I'm not at all sure that this is the right thing to do.  I could also change
> get_desc_base() itself to do the casting, which should do the right thing for
> all callers, but I'm not sure if that's what all callers want.  Anybody else
> have an opinion?
   


get_desc_base() is broken and should be fixed.  No caller could possibly 
want this sign extension (64-bit segment bases are only possible using 
MSR_FS_BASE/MSR_GS_BASE/MSR_KERNEL_GS_BASE).


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: 0.11: SMP guests using one host CPU only?

2009-10-20 Thread Avi Kivity

On 10/20/2009 10:19 PM, Tomasz Chmielewski wrote:


>> I meant, how many qemu threads are there, and how much cpu does each
>> take?
>
> There is only one qemu thread for the 4-cpu guest.


Not possible.  Even a single-cpu guest has two threads.

What does 'ls /proc/$(pgrep qemu)/task' show?
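
The same check can be scripted. A minimal sketch in Python, assuming a Linux host where each thread of a process appears as a numeric directory under /proc/<pid>/task; thread_ids and its proc_root parameter are names made up for this example.

```python
import os


def thread_ids(pid, proc_root="/proc"):
    """Return the sorted thread IDs listed under <proc_root>/<pid>/task."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())


if __name__ == "__main__":
    # Inspect this Python process; substitute the qemu PID on a real host.
    try:
        tids = thread_ids(os.getpid())
        print("%d thread(s): %s" % (len(tids), tids))
    except OSError:
        print("no task directory found (not a Linux /proc?)")
```

For a guest started with -smp 4, more than one entry would be expected here.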


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: List of unaccessible x86 states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 03:19:41PM +0200, Jan Kiszka wrote:
 Alexander Graf wrote:
  On 20.10.2009, at 15:01, Jan Kiszka wrote:
  
  Hi all,
 
  as the list of yet user-unaccessible x86 states is a bit volatile ATM,
  this is an attempt to collect the precise requirements for additional
  state fields. Once everyone feels the list is complete, we can decide
  how to partition it into one ore more substates for the new
  KVM_GET/SET_VCPU_STATE interface.
 
  What I read so far (or tried to patch already):
 
  - nmi_masked
  - nmi_pending
  - nmi_injected
  - kvm_queued_exception (whole struct content)
  - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
 
  Unclear points (for me) from the last discussion:
 
  - sipi_vector
  - MCE (covered via kvm_queued_exception, or does it require more?)
 
  Please extend or correct the list as required.
  
  hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to  
  sync it.
 
 OK. Whole hflags or just the GIF bit?
 
 If we allow access to all bits, can user space cause any problems
 (beyond screwing up its guests) by passing weird patterns?
 
HF_NMI_MASK should be migrated too. Destination should enable IRET intercept if
HF_NMI_MASK is set. Or we can assume that migration in the middle of NMI
will never happen :)

--
Gleb.


Re: List of unaccessible x86 states

2009-10-20 Thread Alexander Graf


On 20.10.2009, at 15:19, Jan Kiszka wrote:


> Alexander Graf wrote:
>> On 20.10.2009, at 15:01, Jan Kiszka wrote:
>>> Hi all,
>>>
>>> as the list of yet user-unaccessible x86 states is a bit volatile ATM,
>>> this is an attempt to collect the precise requirements for additional
>>> state fields. Once everyone feels the list is complete, we can decide
>>> how to partition it into one or more substates for the new
>>> KVM_GET/SET_VCPU_STATE interface.
>>>
>>> What I read so far (or tried to patch already):
>>>
>>> - nmi_masked
>>> - nmi_pending
>>> - nmi_injected
>>> - kvm_queued_exception (whole struct content)
>>> - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
>>>
>>> Unclear points (for me) from the last discussion:
>>>
>>> - sipi_vector
>>> - MCE (covered via kvm_queued_exception, or does it require more?)
>>>
>>> Please extend or correct the list as required.
>>
>> hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to
>> sync it.
>
> OK. Whole hflags or just the GIF bit?

ag...@busu:~/git/kvm> grep -R HF_ arch/x86/include/asm/*kvm*
arch/x86/include/asm/kvm_host.h:#define HF_GIF_MASK     (1 << 0)
arch/x86/include/asm/kvm_host.h:#define HF_HIF_MASK     (1 << 1)
arch/x86/include/asm/kvm_host.h:#define HF_VINTR_MASK   (1 << 2)
arch/x86/include/asm/kvm_host.h:#define HF_NMI_MASK     (1 << 3)
arch/x86/include/asm/kvm_host.h:#define HF_IRET_MASK    (1 << 4)

I can only speak for GIF here, and that should be fine. Not knowing
about the others, it does seem like we could get race conditions though.

> If we allow access to all bits, can user space cause any problems
> (beyond screwing up its guests) by passing weird patterns?

IMHO the hflags should be converted between userspace and kernel
representation. There's a good chance we run older userspace that
doesn't know about certain flags yet and I'd like to keep the bits as
flexible as possible.
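
As a rough illustration of such a conversion, here is a sketch in Python. The HF_* values are the ones from the grep output above; the userspace-side constant USER_HFLAG_GIF, the EXPORTED table and both function names are assumptions made up for this example, and only GIF is exported because that is the only bit vouched for in this thread.

```python
# Kernel-side bits, as listed in kvm_host.h above.
HF_GIF_MASK = 1 << 0
HF_HIF_MASK = 1 << 1
HF_VINTR_MASK = 1 << 2
HF_NMI_MASK = 1 << 3
HF_IRET_MASK = 1 << 4

# Hypothetical userspace representation: only explicitly exported bits exist.
USER_HFLAG_GIF = 1 << 0
EXPORTED = {HF_GIF_MASK: USER_HFLAG_GIF}


def hflags_to_user(kernel_hflags):
    """Translate kernel hflags to the userspace view, hiding unexported bits."""
    user = 0
    for kernel_bit, user_bit in EXPORTED.items():
        if kernel_hflags & kernel_bit:
            user |= user_bit
    return user


def hflags_from_user(kernel_hflags, user_hflags):
    """Apply exported bits from userspace; leave all other kernel bits alone."""
    result = kernel_hflags
    for kernel_bit, user_bit in EXPORTED.items():
        if user_hflags & user_bit:
            result |= kernel_bit
        else:
            result &= ~kernel_bit
    return result


if __name__ == "__main__":
    print(bin(hflags_to_user(HF_GIF_MASK | HF_NMI_MASK)))
    print(bin(hflags_from_user(HF_NMI_MASK, USER_HFLAG_GIF)))
```

With a table like this, bits that userspace passes but the kernel does not export are ignored, so weird patterns cannot touch internal flags.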


Alex



Re: List of unaccessible x86 states

2009-10-20 Thread Jan Kiszka
Gleb Natapov wrote:
 On Tue, Oct 20, 2009 at 03:19:41PM +0200, Jan Kiszka wrote:
 Alexander Graf wrote:
 On 20.10.2009, at 15:01, Jan Kiszka wrote:

 Hi all,

 as the list of yet user-unaccessible x86 states is a bit volatile ATM,
 this is an attempt to collect the precise requirements for additional
 state fields. Once everyone feels the list is complete, we can decide
 how to partition it into one ore more substates for the new
 KVM_GET/SET_VCPU_STATE interface.

 What I read so far (or tried to patch already):

 - nmi_masked
 - nmi_pending
 - nmi_injected
 - kvm_queued_exception (whole struct content)
 - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)

 Unclear points (for me) from the last discussion:

 - sipi_vector
 - MCE (covered via kvm_queued_exception, or does it require more?)

 Please extend or correct the list as required.
 hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to  
 sync it.
 OK. Whole hflags or just the GIF bit?

 If we allow access to all bits, can user space cause any problems
 (beyond screwing up its guests) by passing weird patterns?

 HF_NMI_MASK should be migrated too. Destination should enable IRET intercept
 if HF_NMI_MASK is set. Or we can assume that migration in the middle of NMI
 will never happen :)

HF_NMI_MASK is redundant to the vendor-agnostic nmi_masked and would
therefore likely be masked out.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


Re: List of unaccessible x86 states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 03:29:38PM +0200, Jan Kiszka wrote:
 Gleb Natapov wrote:
  On Tue, Oct 20, 2009 at 03:19:41PM +0200, Jan Kiszka wrote:
  Alexander Graf wrote:
  On 20.10.2009, at 15:01, Jan Kiszka wrote:
 
  Hi all,
 
  as the list of yet user-unaccessible x86 states is a bit volatile ATM,
  this is an attempt to collect the precise requirements for additional
  state fields. Once everyone feels the list is complete, we can decide
  how to partition it into one or more substates for the new
  KVM_GET/SET_VCPU_STATE interface.
 
  What I read so far (or tried to patch already):
 
  - nmi_masked
  - nmi_pending
  - nmi_injected
  - kvm_queued_exception (whole struct content)
  - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
 
  Unclear points (for me) from the last discussion:
 
  - sipi_vector
  - MCE (covered via kvm_queued_exception, or does it require more?)
 
  Please extend or correct the list as required.
  hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to  
  sync it.
  OK. Whole hflags or just the GIF bit?
 
  If we allow access to all bits, can user space cause any problems
  (beyond screwing up its guests) by passing weird patterns?
 
  HF_NMI_MASK should be migrated too. Destination should enable IRET
  intercept if HF_NMI_MASK is set. Or we can assume that migration in the
  middle of NMI will never happen :)
 
 HF_NMI_MASK is redundant to the vendor-agnostic nmi_masked and would
 therefore likely be masked out.
 
Correct. We can restore HF_NMI_MASK from nmi_masked.
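
As a rough sketch of that restore step on the destination: the function name is made up for illustration, and the second return value only reflects the earlier remark that the IRET intercept should be enabled whenever HF_NMI_MASK is set.

```python
HF_NMI_MASK = 1 << 3  # bit position from the kvm_host.h grep earlier in the thread


def restore_nmi_state(hflags, nmi_masked):
    """Rebuild HF_NMI_MASK from the vendor-agnostic nmi_masked value.

    Returns the new hflags and whether the IRET intercept should be enabled.
    """
    if nmi_masked:
        hflags |= HF_NMI_MASK
    else:
        hflags &= ~HF_NMI_MASK
    return hflags, bool(hflags & HF_NMI_MASK)
```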

--
Gleb.


Re: List of unaccessible x86 states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 03:01:15PM +0200, Jan Kiszka wrote:
 Hi all,
 
 as the list of yet user-unaccessible x86 states is a bit volatile ATM,
 this is an attempt to collect the precise requirements for additional
 state fields. Once everyone feels the list is complete, we can decide
 how to partition it into one or more substates for the new
 KVM_GET/SET_VCPU_STATE interface.
 
 What I read so far (or tried to patch already):
 
 - nmi_masked
 - nmi_pending
 - nmi_injected
 - kvm_queued_exception (whole struct content)
 - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
 
 Unclear points (for me) from the last discussion:
 
 - sipi_vector
Should be migrated.

 - MCE (covered via kvm_queued_exception, or does it require more?)
 
 Please extend or correct the list as required.
 
 Jan
 
 -- 
 Siemens AG, Corporate Technology, CT SE 2
 Corporate Competence Center Embedded Linux

--
Gleb.


[PATCH] v3: use upstream kvm_vcpu_ioctl

2009-10-20 Thread Glauber Costa
[v2: we already return -errno, so fix testers ]
[v3: keep error message for apic related failures ]

Signed-off-by: Glauber Costa glom...@redhat.com
---
 kvm-all.c  |3 --
 qemu-kvm-x86.c |   90 +--
 qemu-kvm.c |   31 ---
 qemu-kvm.h |1 +
 4 files changed, 48 insertions(+), 77 deletions(-)
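
For readers unfamiliar with the convention the v2 note refers to: the wrapper returns the negative errno value on failure, so callers test for r < 0 instead of r == -1 plus errno. A small Python sketch of the same idea follows; vcpu_ioctl and describe are illustrative names, not qemu functions, and only the standard fcntl.ioctl call is assumed.

```python
import fcntl
import os


def vcpu_ioctl(fd, request, arg=0):
    """Issue an ioctl and return its result, or -errno on failure."""
    try:
        return fcntl.ioctl(fd, request, arg)
    except OSError as exc:
        return -exc.errno


def describe(ret):
    """Turn a negative return value into a readable error string."""
    return os.strerror(-ret) if ret < 0 else "ok"
```

A caller then checks a single value, for example "if ret < 0: report describe(ret)", and no longer needs to save and restore errno around cleanup code.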

diff --git a/kvm-all.c b/kvm-all.c
index 0a8aa4c..50cd1fb 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -863,7 +863,6 @@ int kvm_vm_ioctl(KVMState *s, int type, ...)
 return ret;
 }
 
-#ifdef KVM_UPSTREAM
 int kvm_vcpu_ioctl(CPUState *env, int type, ...)
 {
 int ret;
@@ -881,8 +880,6 @@ int kvm_vcpu_ioctl(CPUState *env, int type, ...)
 return ret;
 }
 
-#endif
-
 int kvm_has_sync_mmu(void)
 {
 #ifdef KVM_CAP_SYNC_MMU
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index fb70ede..09e4f8c 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -174,18 +174,11 @@ static int kvm_handle_tpr_access(CPUState *env)
 
 int kvm_enable_vapic(CPUState *env, uint64_t vapic)
 {
-   int r;
struct kvm_vapic_addr va = {
.vapic_addr = vapic,
};
 
-   r = ioctl(env->kvm_fd, KVM_SET_VAPIC_ADDR, &va);
-   if (r == -1) {
-   r = -errno;
-   perror("kvm_enable_vapic");
-   return r;
-   }
-   return 0;
+   return kvm_vcpu_ioctl(env, KVM_SET_VAPIC_ADDR, &va);
 }
 
 #endif
@@ -283,28 +276,29 @@ int kvm_destroy_memory_alias(kvm_context_t kvm, uint64_t phys_start)
 
 int kvm_get_lapic(CPUState *env, struct kvm_lapic_state *s)
 {
-   int r;
+int r = 0;
+
if (!kvm_irqchip_in_kernel())
-   return 0;
-   r = ioctl(env->kvm_fd, KVM_GET_LAPIC, s);
-   if (r == -1) {
-   r = -errno;
-   perror("kvm_get_lapic");
-   }
-   return r;
+   return r;
+
+   r = kvm_vcpu_ioctl(env, KVM_GET_LAPIC, s);
+if (r < 0)
+fprintf(stderr, "KVM_GET_LAPIC failed\n")
+return r;
 }
 
 int kvm_set_lapic(CPUState *env, struct kvm_lapic_state *s)
 {
-   int r;
+int r = 0;
+
if (!kvm_irqchip_in_kernel())
return 0;
-   r = ioctl(env->kvm_fd, KVM_SET_LAPIC, s);
-   if (r == -1) {
-   r = -errno;
-   perror("kvm_set_lapic");
-   }
-   return r;
+
+   r = kvm_vcpu_ioctl(env, KVM_SET_LAPIC, s);
+
+if (r < 0)
+fprintf(stderr, "KVM_SET_LAPIC failed\n")
+return r;
 }
 
 #endif
@@ -356,7 +350,6 @@ int kvm_has_pit_state2(kvm_context_t kvm)
 void kvm_show_code(CPUState *env)
 {
 #define SHOW_CODE_LEN 50
-   int fd = env->kvm_fd;
struct kvm_regs regs;
struct kvm_sregs sregs;
int r, n;
@@ -365,13 +358,13 @@ void kvm_show_code(CPUState *env)
char code_str[SHOW_CODE_LEN * 3 + 1];
unsigned long rip;
 
-   r = ioctl(fd, KVM_GET_SREGS, &sregs);
-   if (r == -1) {
+   r = kvm_vcpu_ioctl(env, KVM_GET_SREGS, &sregs);
+   if (r < 0 ) {
perror("KVM_GET_SREGS");
return;
}
-   r = ioctl(fd, KVM_GET_REGS, &regs);
-   if (r == -1) {
+   r = kvm_vcpu_ioctl(env, KVM_GET_REGS, &regs);
+   if (r < 0) {
perror("KVM_GET_REGS");
return;
}
@@ -420,29 +413,25 @@ struct kvm_msr_list *kvm_get_msr_list(kvm_context_t kvm)
 int kvm_get_msrs(CPUState *env, struct kvm_msr_entry *msrs, int n)
 {
 struct kvm_msrs *kmsrs = qemu_malloc(sizeof *kmsrs + n * sizeof *msrs);
-int r, e;
+int r;
 
 kmsrs->nmsrs = n;
 memcpy(kmsrs->entries, msrs, n * sizeof *msrs);
-r = ioctl(env->kvm_fd, KVM_GET_MSRS, kmsrs);
-e = errno;
+r = kvm_vcpu_ioctl(env, KVM_GET_MSRS, kmsrs);
 memcpy(msrs, kmsrs->entries, n * sizeof *msrs);
 free(kmsrs);
-errno = e;
 return r;
 }
 
 int kvm_set_msrs(CPUState *env, struct kvm_msr_entry *msrs, int n)
 {
 struct kvm_msrs *kmsrs = qemu_malloc(sizeof *kmsrs + n * sizeof *msrs);
-int r, e;
+int r;
 
 kmsrs->nmsrs = n;
 memcpy(kmsrs->entries, msrs, n * sizeof *msrs);
-r = ioctl(env->kvm_fd, KVM_SET_MSRS, kmsrs);
-e = errno;
+r = kvm_vcpu_ioctl(env, KVM_SET_MSRS, kmsrs);
 free(kmsrs);
-errno = e;
 return r;
 }
 
@@ -464,7 +453,7 @@ int kvm_get_mce_cap_supported(kvm_context_t kvm, uint64_t *mce_cap,
 int kvm_setup_mce(CPUState *env, uint64_t *mcg_cap)
 {
 #ifdef KVM_CAP_MCE
-return ioctl(env->kvm_fd, KVM_X86_SETUP_MCE, mcg_cap);
+return kvm_vcpu_ioctl(env, KVM_X86_SETUP_MCE, mcg_cap);
 #else
 return -ENOSYS;
 #endif
@@ -473,7 +462,7 @@ int kvm_setup_mce(CPUState *env, uint64_t *mcg_cap)
 int kvm_set_mce(CPUState *env, struct kvm_x86_mce *m)
 {
 #ifdef KVM_CAP_MCE
-return ioctl(env->kvm_fd, KVM_X86_SET_MCE, m);
+return kvm_vcpu_ioctl(env, KVM_X86_SET_MCE, m);
 #else
 return -ENOSYS;
 #endif
@@ -496,13 +485,12 @@ static void print_dt(FILE *file, const char *name, struct kvm_dtable *dt)
 
 void kvm_show_regs(CPUState 

Re: List of unaccessible x86 states

2009-10-20 Thread Jan Kiszka
Alexander Graf wrote:
 On 20.10.2009, at 15:01, Jan Kiszka wrote:
 
 Hi all,

 as the list of yet user-unaccessible x86 states is a bit volatile ATM,
 this is an attempt to collect the precise requirements for additional
 state fields. Once everyone feels the list is complete, we can decide
 how to partition it into one or more substates for the new
 KVM_GET/SET_VCPU_STATE interface.

 What I read so far (or tried to patch already):

 - nmi_masked
 - nmi_pending
 - nmi_injected
 - kvm_queued_exception (whole struct content)
 - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)

 Unclear points (for me) from the last discussion:

 - sipi_vector
 - MCE (covered via kvm_queued_exception, or does it require more?)

 Please extend or correct the list as required.
 
 hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to  
 sync it.

BTW, GIF is related to svm nesting, right?

Orit, are there any additional states arriving on the vmx side as well
with your nesting patches?

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


Re: [Autotest] [PATCH] Test 802.1Q vlan of nic

2009-10-20 Thread Lucas Meneghel Rodrigues
On Tue, Oct 20, 2009 at 11:19 AM, Michael Goldish mgold...@redhat.com wrote:
 See comments below.

 - Dor Laor dl...@redhat.com wrote:

 On 10/15/2009 11:48 AM, Amos Kong wrote:
 
  Test 802.1Q vlan of nic, config it by vconfig command.
     1) Create two VMs
     2) Setup guests in different vlan by vconfig and test
 communication by ping
        using hard-coded ip address
     3) Setup guests in same vlan and test communication by ping
     4) Recover the vlan config
 
  Signed-off-by: Amos Kongak...@redhat.com
  ---
    client/tests/kvm/kvm_tests.cfg.sample |    6 +++
    client/tests/kvm/tests/vlan_tag.py    |   73
 +
    2 files changed, 79 insertions(+), 0 deletions(-)
    mode change 100644 =  100755 client/tests/kvm/scripts/qemu-ifup

 In general the above should come as an independent patch.

    create mode 100644 client/tests/kvm/tests/vlan_tag.py
 
  diff --git a/client/tests/kvm/kvm_tests.cfg.sample
 b/client/tests/kvm/kvm_tests.cfg.sample
  index 9ccc9b5..4e47767 100644
  --- a/client/tests/kvm/kvm_tests.cfg.sample
  +++ b/client/tests/kvm/kvm_tests.cfg.sample
  @@ -166,6 +166,12 @@ variants:
            used_cpus = 5
            used_mem = 2560
 
  +    - vlan_tag:  install setup
  +        type = vlan_tag
  +        subnet2 = 192.168.123
  +        vlans = 10 20

 If we want to be fanatic and safe we should dynamically choose subnet
 and vlans numbers that are not used on the host instead of hard code
 it.

 For the sake of safety maybe we should start both VMs with -snapshot.
 Dor, what do you think?  Is it safe to start 2 VMs with the same disk image
 when only one of them uses -snapshot?

  +        nic_mode = tap
  +        nic_model = e1000

 Why only e1000? Let's test virtio and rtl8139 as well. Can't you
 inherit the nic model from the config?

 It's not just inherited, it's overwritten, because nic_model is defined
 later in the file in a variants block.  So this nic_model line has no
 effect.

 
        - autoit:       install setup
            type = autoit
  diff --git a/client/tests/kvm/scripts/qemu-ifup
 b/client/tests/kvm/scripts/qemu-ifup
  old mode 100644
  new mode 100755
  diff --git a/client/tests/kvm/tests/vlan_tag.py
 b/client/tests/kvm/tests/vlan_tag.py
  new file mode 100644
  index 000..15e763f
  --- /dev/null
  +++ b/client/tests/kvm/tests/vlan_tag.py
  @@ -0,0 +1,73 @@
  +import logging, time
  +from autotest_lib.client.common_lib import error
  +import kvm_subprocess, kvm_test_utils, kvm_utils
  +
  +def run_vlan_tag(test, params, env):
  +    """
  +    Test 802.1Q vlan of nic, config it by vconfig command.
  +
  +    1) Create two VMs
  +    2) Setup guests in different vlan by vconfig and test
 communication by ping
  +       using hard-coded ip address
  +    3) Setup guests in same vlan and test communication by ping
  +    4) Recover the vlan config
  +
  +    @param test: Kvm test object
  +    @param params: Dictionary with the test parameters.
  +    @param env: Dictionary with test environment.
  +    """
  +
  +    vm = []
  +    session = []
  +    subnet2 = params.get("subnet2")
  +    vlans = params.get("vlans").split()
  +
  +    vm.append(kvm_test_utils.get_living_vm(env, "%s" % params.get("main_vm")))

 There's no need for the "%s" here.
 ...get_living_vm(env, params.get("main_vm"))) should work.

  +    params_vm2 = params.copy()
  +    params_vm2['image_snapshot'] = "yes"
  +    params_vm2['kill_vm_gracefully'] = "no"
  +    params_vm2["address_index"] = int(params.get("address_index", 0))+1
  +    vm.append(vm[0].clone("vm2", params_vm2))
  +    kvm_utils.env_register_vm(env, "vm2", vm[1])
  +    if not vm[1].create():
  +        raise error.TestError("VM 1 create failed")


 The whole 7-8 lines above should be grouped as a function to clone
 existing VM. It should be part of kvm autotest infrastructure.
 Besides that, it looks good.

 There's already a clone function and it's being used here.

 Instead of those 7-8 lines, why not just define the VM in the config file?
 It looks like you're always using 2 VMs so there's no reason to do this in
 test code.  This should do what you want:

 - vlan_tag:  install setup
    type = vlan_tag
    subnet2 = 192.168.123
    vlans = 10 20
    nic_mode = tap
     vms += " vm2"
     extra_params_vm2 += " -snapshot"
    kill_vm_gracefully_vm2 = no
    address_index_vm2 = 1

 The preprocessor then automatically creates vm2 and registers it in env.
 To use it in the test just do:

 vm.append(kvm_test_utils.get_living_vm(env, "vm2"))

 You can also use a parameter that tells the test which VM to use if you don't
 want the name vm2 hardcoded into the test.
 Add something like this to the config file:

    2nd_vm = vm2

 and in the test use params.get("2nd_vm") instead of "vm2" (just like you use
 main_vm).

  +
  +    for i in range(2):
  +        session.append(kvm_test_utils.wait_for_login(vm[i]))
  +
  +    try:
  +        vconfig_cmd = "vconfig add eth0 %s;ifconfig eth0.%s %s.%s"
  +        # Attempt 

Re: List of unaccessible x86 states

2009-10-20 Thread Alexander Graf


On 20.10.2009, at 15:37, Jan Kiszka wrote:


Alexander Graf wrote:

On 20.10.2009, at 15:01, Jan Kiszka wrote:


Hi all,

as the list of yet user-unaccessible x86 states is a bit volatile  
ATM,
this is an attempt to collect the precise requirements for  
additional
state fields. Once everyone feels the list is complete, we can  
decide

how to partition it into one or more substates for the new
KVM_GET/SET_VCPU_STATE interface.

What I read so far (or tried to patch already):

- nmi_masked
- nmi_pending
- nmi_injected
- kvm_queued_exception (whole struct content)
- KVM_REQ_TRIPLE_FAULT (from vcpu.requests)

Unclear points (for me) from the last discussion:

- sipi_vector
- MCE (covered via kvm_queued_exception, or does it require more?)

Please extend or correct the list as required.


hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to
sync it.


BTW, GIF is related to svm nesting, right?


Yes and no. It's an architecture addition that came with SVM, yes.

The problem is that I don't want to support migrating while in a  
nested VM. We can just #VMEXIT just before migrating with a  
VMEXIT_INTR intercept.


Now just after #VMEXIT we're in a state that's pure host context, but  
has GIF=0. So we need to know about that in userspace to support  
migration.


Alex


RE: Do I set up separate bridges for each guest?

2009-10-20 Thread Neil Aggarwal
Dor:

 The simplest thing is to use a single bridge for all -
 The physical nic should be part of it and supply the outside world 
 connection. The physical nic doesn't need an IP and the bridge should 
 own it. All vms can use this bridge.

I want to assign a static IP to each of the guests,
how would I do that with a single bridge?

Thanks,
Neil

--
Neil Aggarwal, (281)846-8957, www.JAMMConsulting.com
Will your e-commerce site go offline if you have
a DB server failure, fiber cut, flood, fire, or other disaster?
If so, ask about our geographically redundant database system. 



Re: List of unaccessible x86 states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:
 
 On 20.10.2009, at 15:37, Jan Kiszka wrote:
 
 Alexander Graf wrote:
 On 20.10.2009, at 15:01, Jan Kiszka wrote:
 
 Hi all,
 
 as the list of yet user-unaccessible x86 states is a bit
 volatile ATM,
 this is an attempt to collect the precise requirements for
 additional
 state fields. Once everyone feels the list is complete, we can
 decide
 how to partition it into one or more substates for the new
 KVM_GET/SET_VCPU_STATE interface.
 
 What I read so far (or tried to patch already):
 
 - nmi_masked
 - nmi_pending
 - nmi_injected
 - kvm_queued_exception (whole struct content)
 - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
 
 Unclear points (for me) from the last discussion:
 
 - sipi_vector
 - MCE (covered via kvm_queued_exception, or does it require more?)
 
 Please extend or correct the list as required.
 
 hflags. Qemu supports GIF, kvm supports GIF, but no side knows how to
 sync it.
 
 BTW, GIF is related to svm nesting, right?
 
 Yes and no. It's an architecture addition that came with SVM, yes.
 
 The problem is that I don't want to support migrating while in a
Why not?

 nested VM. We can just #VMEXIT just before migrating with a
 VMEXIT_INTR intercept.
 
We don't notify the kernel about migration currently. CPU state is migrated
when the VM is already paused, so how can we exit the nested guest at this point?

 Now just after #VMEXIT we're in a state that's pure host context,
 but has GIF=0. So we need to know about that in userspace to support
 migration.
 
 Alex

--
Gleb.


Re: 0.11: SMP guests using one host CPU only?

2009-10-20 Thread Tomasz Chmielewski

Avi Kivity wrote:

On 10/20/2009 10:19 PM, Tomasz Chmielewski wrote:


I meant, how many qemu threads are there, and how much cpu does each 
take?



There is only one qemu thread for the 4-cpu guest.


Not possible.  Even a single-cpu guest has two threads.


ps auxH should show me all threads? I started it multiple times, and it showed
1 thread for the 4-CPU guest
(with no CPU intensive tasks running - could this be a reason?).



What does 'ls /proc/$(pgrep qemu)/task' show?


Running several CPU-intensive processes on this guest uses only one CPU on the 
host.

Both ps auxH and /proc confirm that this guest has 4-5 threads when I run 
several CPU-intensive apps.

Only one thread for this guest uses 100% CPU time; other threads use ~0%.

If I don't run any CPU-intensive tasks on this guest, it only runs one thread
(unless I misinterpret something here).


Some 1-CPU guests have only one thread though?


# QEMU_TASKS=$(pgrep qemu)

# for QEMU_TASK in $QEMU_TASKS; do cat /proc/$QEMU_TASK/cmdline ; echo ; ls 
/proc/$QEMU_TASK/task ; echo ; done
/usr/bin/qemu-system-x86_64-m1024-drivefile=/srv/kvm/images/lvs2,if=virtio,cache=writeback,index=0,boot=on-netnic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3F-nettap,vlan=0,script=/etc/qemu-ifup-localtime-smp4
17687/  19018/  19020/  19069/

/usr/bin/qemu-system-x86_64-m1024-drivefile=/srv/kvm/images/gluster1a,if=virtio,cache=writeback,index=0,boot=on-netnic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3A-nettap,vlan=0,script=/etc/qemu-ifup-localtime
19220/  24857/

/usr/bin/qemu-system-x86_64-m1024-drivefile=/srv/kvm/images/gluster2a,if=virtio,cache=writeback,index=0,boot=on-netnic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3B-nettap,vlan=0,script=/etc/qemu-ifup-localtime
19252/  24896/

/usr/bin/qemu-system-x86_64-m1024-drivefile=/srv/kvm/images/gluster3a,if=virtio,cache=writeback,index=0,boot=on-netnic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3C-nettap,vlan=0,script=/etc/qemu-ifup-localtime
19258/  24934/

/usr/bin/qemu-system-x86_64-m1024-drivefile=/srv/kvm/images/gluster4a,if=virtio,cache=writeback,index=0,boot=on-netnic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3D-nettap,vlan=0,script=/etc/qemu-ifup-localtime
25878/

/usr/bin/qemu-system-x86_64-m1024-drivefile=/srv/kvm/images/lvs1,if=virtio,cache=writeback,index=0,boot=on-netnic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3E-nettap,vlan=0,script=/etc/qemu-ifup-localtime
25920/



No CPU-intensive apps:

/usr/bin/qemu-system-x86_64-m1024-drivefile=/srv/kvm/images/lvs2,if=virtio,cache=writeback,index=0,boot=on-netnic,vlan=0,model=virtio,macaddr=F2:4A:51:41:B1:3F-nettap,vlan=0,script=/etc/qemu-ifup-localtime-smp4
17687/
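
For anyone who wants to repeat this check and also see per-thread CPU usage, here is a small sketch that reads the same /proc/<pid>/task directory. The function name and the proc_root parameter are illustrative; the field positions (utime is field 14, stime is field 15 of the stat file) are taken from proc(5).

```python
import os


def thread_cpu_ticks(pid, proc_root="/proc"):
    """Map each thread ID of `pid` to its (utime, stime) in clock ticks."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    ticks = {}
    for tid in os.listdir(task_dir):
        with open(os.path.join(task_dir, tid, "stat")) as f:
            stat = f.read()
        # The command name (field 2) may contain spaces or parentheses,
        # so split after the last ')' and count from field 3 (state).
        fields = stat.rsplit(")", 1)[1].split()
        ticks[int(tid)] = (int(fields[11]), int(fields[12]))
    return ticks
```

Sampling this twice a few seconds apart shows which threads are actually accumulating CPU time, which is the distinction being discussed here.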



--
Tomasz Chmielewski
http://wpkg.org



Re: List of unaccessible x86 states

2009-10-20 Thread Alexander Graf


On 20.10.2009, at 15:48, Gleb Natapov wrote:


On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:


On 20.10.2009, at 15:37, Jan Kiszka wrote:


Alexander Graf wrote:

On 20.10.2009, at 15:01, Jan Kiszka wrote:


Hi all,

as the list of yet user-unaccessible x86 states is a bit
volatile ATM,
this is an attempt to collect the precise requirements for
additional
state fields. Once everyone feels the list is complete, we can
decide
how to partition it into one or more substates for the new
KVM_GET/SET_VCPU_STATE interface.

What I read so far (or tried to patch already):

- nmi_masked
- nmi_pending
- nmi_injected
- kvm_queued_exception (whole struct content)
- KVM_REQ_TRIPLE_FAULT (from vcpu.requests)

Unclear points (for me) from the last discussion:

- sipi_vector
- MCE (covered via kvm_queued_exception, or does it require more?)

Please extend or correct the list as required.


hflags. Qemu supports GIF, kvm supports GIF, but no side knows  
how to

sync it.


BTW, GIF is related to svm nesting, right?


Yes and no. It's an architecture addition that came with SVM, yes.

The problem is that I don't want to support migrating while in a

Why not?


Because then we'd have to transfer the whole host cpu cache and the  
merged intercept bitmaps to userspace as well. That's just too many  
internals to expose IMHO.



nested VM. We can just #VMEXIT just before migrating with a
VMEXIT_INTR intercept.

We don't notify the kernel about migration currently. CPU state is migrated
when the VM is already paused, so how can we exit the nested guest at this point?


Hm - introduce a new ioctl? I haven't fully thought it through yet :-).

Alex



[PATCH] sched, cpuacct: fix niced guest time accounting

2009-10-20 Thread Ryota Ozaki
Hi Avi,

This is the patch we discussed earlier. Please review it.

BTW, should this be sent to lkml as well?

Regards,
  ozaki-r

From 8aea0f1a9acc891d1208bc462a05797765451ab4 Mon Sep 17 00:00:00 2001
From: Ryota Ozaki ozaki.ry...@gmail.com
Date: Tue, 20 Oct 2009 22:41:12 +0900
Subject: [PATCH] sched, cpuacct: fix niced guest time accounting

CPU time of a guest is always accounted in 'user' time
without concern for the nice value of its counterpart
process although the guest is scheduled under the nice
value.

This patch fixes the defect and accounts cpu time of
a niced guest in 'nice' time, the same as a niced process.

And also the patch adds 'guest_nice' to cpuacct. The
value provides niced guest cpu time which is like 'nice'
to 'user'.

Signed-off-by: Ryota Ozaki ozaki.ry...@gmail.com
---
 Documentation/filesystems/proc.txt |3 ++-
 fs/proc/stat.c |   17 +++--
 include/linux/kernel_stat.h|1 +
 kernel/sched.c |9 +++--
 4 files changed, 21 insertions(+), 9 deletions(-)
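
To illustrate the effect on consumers of /proc/stat: with this patch each cpu line gains a tenth column. A hedged sketch of a reader that tolerates both the old and the new layout follows. The column order is taken from the proc.txt hunk in this patch; parse_cpu_line and CPU_FIELDS are made-up names.

```python
CPU_FIELDS = ("user", "nice", "system", "idle", "iowait",
              "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_line(line):
    """Parse one 'cpu' or 'cpuN' line of /proc/stat into a dict of ticks."""
    parts = line.split()
    name, values = parts[0], [int(v) for v in parts[1:]]
    # Kernels without this patch emit fewer columns; missing ones stay 0.
    stats = dict.fromkeys(CPU_FIELDS, 0)
    stats.update(zip(CPU_FIELDS, values))
    return name, stats
```

A tool that only knows the first nine columns keeps working, since the new value is appended at the end of the line.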

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 2c48f94..4af0018 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1072,7 +1072,8 @@ second).  The meanings of the columns are as follows, from left to right:
 - irq: servicing interrupts
 - softirq: servicing softirqs
 - steal: involuntary wait
-- guest: running a guest
+- guest: running a normal guest
+- guest_nice: running a niced guest

 The intr line gives counts of interrupts  serviced since boot time, for each
 of the  possible system interrupts.   The first  column  is the  total of  all
diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 7cc726c..67c30a7 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -27,7 +27,7 @@ static int show_stat(struct seq_file *p, void *v)
int i, j;
unsigned long jif;
cputime64_t user, nice, system, idle, iowait, irq, softirq, steal;
-   cputime64_t guest;
+   cputime64_t guest, guest_nice;
u64 sum = 0;
u64 sum_softirq = 0;
unsigned int per_softirq_sums[NR_SOFTIRQS] = {0};
@@ -36,7 +36,7 @@ static int show_stat(struct seq_file *p, void *v)

user = nice = system = idle = iowait =
irq = softirq = steal = cputime64_zero;
-   guest = cputime64_zero;
+   guest = guest_nice = cputime64_zero;
getboottime(&boottime);
jif = boottime.tv_sec;

@@ -51,6 +51,8 @@ static int show_stat(struct seq_file *p, void *v)
softirq = cputime64_add(softirq, kstat_cpu(i).cpustat.softirq);
steal = cputime64_add(steal, kstat_cpu(i).cpustat.steal);
guest = cputime64_add(guest, kstat_cpu(i).cpustat.guest);
+   guest_nice = cputime64_add(guest_nice,
+   kstat_cpu(i).cpustat.guest_nice);
for_each_irq_nr(j) {
sum += kstat_irqs_cpu(j, i);
}
@@ -65,7 +67,7 @@ static int show_stat(struct seq_file *p, void *v)
}
sum += arch_irq_stat();

-   seq_printf(p, "cpu  %llu %llu %llu %llu %llu %llu %llu %llu %llu\n",
+   seq_printf(p, "cpu  %llu %llu %llu %llu %llu %llu %llu %llu %llu %llu\n",
(unsigned long long)cputime64_to_clock_t(user),
(unsigned long long)cputime64_to_clock_t(nice),
(unsigned long long)cputime64_to_clock_t(system),
@@ -74,7 +76,8 @@ static int show_stat(struct seq_file *p, void *v)
(unsigned long long)cputime64_to_clock_t(irq),
(unsigned long long)cputime64_to_clock_t(softirq),
(unsigned long long)cputime64_to_clock_t(steal),
-   (unsigned long long)cputime64_to_clock_t(guest));
+   (unsigned long long)cputime64_to_clock_t(guest),
+   (unsigned long long)cputime64_to_clock_t(guest_nice));
for_each_online_cpu(i) {

/* Copy values here to work around gcc-2.95.3, gcc-2.96 */
@@ -88,8 +91,9 @@ static int show_stat(struct seq_file *p, void *v)
softirq = kstat_cpu(i).cpustat.softirq;
steal = kstat_cpu(i).cpustat.steal;
guest = kstat_cpu(i).cpustat.guest;
+   guest_nice = kstat_cpu(i).cpustat.guest_nice;
seq_printf(p,
-   "cpu%d %llu %llu %llu %llu %llu %llu %llu %llu %llu\n",
+   "cpu%d %llu %llu %llu %llu %llu %llu %llu %llu %llu %llu\n",
i,
(unsigned long long)cputime64_to_clock_t(user),
(unsigned long long)cputime64_to_clock_t(nice),
@@ -99,7 +103,8 @@ static int show_stat(struct seq_file *p, void *v)
(unsigned long long)cputime64_to_clock_t(irq),
(unsigned long long)cputime64_to_clock_t(softirq),
(unsigned long long)cputime64_to_clock_t(steal),
- 

Interface is requiring IP address even though it is for a bridge

2009-10-20 Thread Neil Aggarwal
Hello:

I am trying to follow the RHEL virtualization guide
to set up a bridge on a system running CentOS 5.4.

I copied my ifcfg-eth0 to ifcfg-eth0:1 and
set its content to this:

DEVICE=eth0:1
HWADDR=[The MAC address from eth0]
ONBOOT=yes
BRIDGE=br1

I then created ifcfg-br1 with this content:

DEVICE=br1
TYPE=Bridge
BOOTPROTO=static
BROADCAST=192.168.2.255
IPADDR=192.168.2.202
NETMASK=255.255.255.0
NETWORK=192.168.2.0
ONBOOT=yes
DELAY=0

When I run service network restart, I get this error:

error in ifcfg-eth0:1: didn't specify device or ipaddr

I specified the device so it looks like it wants an IP
address but that is contrary to what I am reading on
the Internet.

Am I supposed to give eth0:1 an IP address?

Thanks,
Neil

--
Neil Aggarwal, (281)846-8957, www.JAMMConsulting.com
Will your e-commerce site go offline if you have
a DB server failure, fiber cut, flood, fire, or other disaster?
If so, ask about our geographically redundant database system.



Re: [PATCH 0/3] Further integration with qemu.git

2009-10-20 Thread Marcelo Tosatti
On Mon, Oct 19, 2009 at 11:20:41AM -0200, Glauber Costa wrote:
 A couple of more functions are used from qemu.git.
 Merging keeps going...

Applied, thanks.



Re: [PATCH] move tpr stuff to qemu-kvm-x86.c

2009-10-20 Thread Marcelo Tosatti
On Mon, Oct 19, 2009 at 11:29:25AM -0200, Glauber Costa wrote:
 this whole tpr thing does not belong in common code. Move it to i386 specific
 files.

Applied, thanks.




Re: [PATCH, -next] KVM: x86: Fix 32-bit host build warning

2009-10-20 Thread Marcelo Tosatti
On Tue, Oct 20, 2009 at 02:15:10PM +0200, Jan Kiszka wrote:
 Fixes cast to pointer from integer of different size on 32-bit hosts
 and applies a micro-refactoring.
 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com

Applied, thanks.



Re: KVM: VMX: remove GUEST_CR3 write from vmx_vcpu_run

2009-10-20 Thread Marcelo Tosatti
On Tue, Oct 20, 2009 at 10:14:52PM +0900, Avi Kivity wrote:
 On 10/20/2009 09:37 PM, Marcelo Tosatti wrote:
 GUEST_CR3 is updated via kvm_set_cr3 whenever CR3 value
 changes.

 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index 364263a..325075f 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -3638,10 +3638,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
   {
  struct vcpu_vmx *vmx = to_vmx(vcpu);

 -if (enable_ept && is_paging(vcpu)) {
 -vmcs_writel(GUEST_CR3, vcpu->arch.cr3);
 +if (enable_ept && is_paging(vcpu))
  ept_load_pdptrs(vcpu);
 -}
 +
  /* Record the guest's net vcpu time for enforced NMI injections. */
  if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
  vmx-entry_time = ktime_get();


 Nice.  Any reason why ept_load_pdptrs() couldn't go the same way?

It's already protected by VCPU_EXREG_PDPTR caching, so it does not buy
much.

The advantage would be symmetry with cr3.



Re: [PATCH] v3: use upstream kvm_vcpu_ioctl

2009-10-20 Thread Marcelo Tosatti
On Tue, Oct 20, 2009 at 11:36:58AM -0200, Glauber Costa wrote:
 [v2: we already return -errno, so fix testers ]
 [v3: keep error message for apic related failures ]
 
 Signed-off-by: Glauber Costa glom...@redhat.com

Applied, thanks.



Re: [PATCH] v3: use upstream kvm_vcpu_ioctl

2009-10-20 Thread Marcelo Tosatti
On Tue, Oct 20, 2009 at 11:36:58AM -0200, Glauber Costa wrote:
 [v2: we already return -errno, so fix testers ]
 [v3: keep error message for apic related failures ]
 
 Signed-off-by: Glauber Costa glom...@redhat.com

Dropped, does not compile.



qemu-kvm: require 4K aligned resource size for memory

2009-10-20 Thread Michael S. Tsirkin
KVM does not virtualize the low address bits for memory accesses, so we must
require that the PCI BAR size is a multiple of 4K for passthrough to work
(this also guarantees that the address is 4K aligned).

Users of recent Linux kernels can force the resource size up to 4K
using:

commit 32a9a682bef2f6fce7026bd94d1ce20028b0e52d
Author: Yuji Shimada shimada-...@necst.nec.co.jp
Date:   Mon Mar 16 17:13:39 2009 +0900
PCI: allow assignment of memory resources with a specified alignment

Signed-off-by: Michael S. Tsirkin m...@redhat.com

---

diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index 237060f..c2ef31f 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -381,6 +381,14 @@ static int assigned_dev_register_regions(PCIRegion *io_regions,
 int t = cur_region->type & IORESOURCE_PREFETCH
 ? PCI_ADDRESS_SPACE_MEM_PREFETCH
 : PCI_ADDRESS_SPACE_MEM;
+if (cur_region->size & 0xFFF) {
+fprintf(stderr, "Unable to assign device: PCI region %d "
+"at address 0x%llx has size 0x%x, "
+" which is not a multiple of 4K\n",
+i, (unsigned long long)cur_region->base_addr,
+cur_region->size);
+return -1;
+}
 
 /* map physical memory */
 pci_dev->v_addrs[i].e_physbase = cur_region->base_addr;
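
For readers following the patch above, here is a minimal standalone sketch of the size test it adds. It is an illustration only, not qemu-kvm code: the helper names bar_size_is_4k_multiple() and round_up_4k() are hypothetical, and the only thing taken from the patch is the (size & 0xFFF) mask.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of qemu-kvm): mirrors the test in the
 * patch, which rejects a region when any of the low 12 bits of its
 * size are set.
 */
int bar_size_is_4k_multiple(uint64_t size)
{
    return (size & 0xFFFu) == 0;
}

/*
 * Hypothetical helper: the size a sub-4K resource would have to be
 * padded to before the check above accepts it.
 */
uint64_t round_up_4k(uint64_t size)
{
    return (size + 0xFFFu) & ~(uint64_t)0xFFFu;
}
```

With these definitions, a 256-byte region (0x100) is rejected and would need to be padded to 0x1000, while a region of 0x2000 bytes passes unchanged.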


Re: vhost-net patches

2009-10-20 Thread Michael S. Tsirkin
On Tue, Oct 20, 2009 at 10:14:55AM -0700, Shirley Ma wrote:
 
 Hello Michael,
 
 what is vnet->vector?
 And what do you mean by not defined?
 
 In function:
 
 static int vhost_virtqueue_init()
 {
 ..
   r = vdev->binding->irqfd(vdev->binding_opaque, q->vector,
 vq->call);
 ..
 }.
 
 q->vector is 65535,

Thanks for debugging this.
I think this means that guest does not use MSI-X.

You can verify this by booting guest without vhost,
and performing the following command:
 cat /proc/interrupts

Please note that you currently need recent kernel in guest,
so that it uses MSI-X. I plan on implementing regular IRQ,
but not yet, and it will be slower anyway.
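
To make the diagnosis above concrete: 65535 is 0xffff, the value virtio uses to mean "no MSI-X vector assigned", so it can never pass a bounds check against msix_entries_nr. The sketch below is a standalone stand-in for the check quoted in this thread, not the actual qemu code; the names NO_VECTOR and check_irqfd_vector() are made up for illustration.

```c
#include <errno.h>
#include <stdint.h>

/* "No MSI-X vector assigned" sentinel (0xffff == 65535); name is hypothetical. */
#define NO_VECTOR 0xffffu

/*
 * Hypothetical stand-in for the bounds check quoted from
 * virtio_pci_irqfd(): any vector at or beyond the number of MSI-X
 * entries is refused with -EINVAL.
 */
int check_irqfd_vector(uint16_t vector, unsigned int msix_entries_nr)
{
    if (vector >= msix_entries_nr)
        return -EINVAL;
    return 0;
}
```

With msix_entries_nr == 3 as in the report, vectors 0..2 are accepted and the sentinel is rejected, which matches the EINVAL seen when the guest has not enabled MSI-X.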

 in   static int virtio_pci_irqfd()
 {
 ..
 if (vector >= proxy->pci_dev.msix_entries_nr) {
 fprintf(stderr, " pci irq fd returned vector %d, msix_entries_nr %d\n",
 vector, proxy->pci_dev.msix_entries_nr); <--- I added
 one output line here.
 return -EINVAL;
 
 }...
 The output is:
 
 pci irq fd returned vector 65535, msix_entries_nr 3, EINVAL is
 returned.
 
 thanks
 Shirley Ma
 IBM Linux Technology Center
 15300 SW Koll Parkway
 Beaverton, OR 97006-6063
 Phone(Fax): (503) 578-7638
 
 
 
 

 Michael S. Tsirkin m...@redhat.com wrote on 10/20/2009 04:34 AM:
 To: Shirley Ma/Beaverton/i...@ibmus
 cc: s...@linux.vnet.ibm.com, David Stevens/Beaverton/i...@ibmus,
 kvm@vger.kernel.org
 Subject: Re: vhost-net patches
 
 On Mon, Oct 19, 2009 at 04:08:24PM -0700, Shirley Ma wrote:
  Hello Michael,
 
  They all failed with the following error
  vhost_net_init returned -7
  This is an error message from hw/virtio-net.c:virtio_net_driver_ok() when
  vhost_net_start() fails. It looks like dev->binding->irqfd() is failing in
  vhost_virtqueue_init(). Haven't yet debugged further. I have CONFIG_EVENTFD
  enabled in the host kernel.
 
  From the debug output, looks like the vnet->vector is not defined,
 
 what is vnet->vector?
 And what do you mean by not defined?
 
  and the
  default msix_entries_nr is 3, so it returned EINVAL from virtio_pci_irqfd.
  Looks we need to either disable QEMU_PCI_CAP_MSIX or define vector in QEMU
  configuration?
 
 You shouldn't have to do anything.
 
  I am not familiar with MSIX stuffs.
 
  Thanks
  Shirley
 
 
  s...@linux.vnet.ibm.com wrote on 10/19/2009 03:56 PM:
  To: Michael S. Tsirkin m...@redhat.com, kvm@vger.kernel.org
  cc: David Stevens/Beaverton/i...@ibmus, Shirley Ma/Beaverton/i...@ibmus
  Subject: Re: vhost-net patches
 
  On Sun, 2009-10-18 at 19:32 +0200, Michael S. Tsirkin wrote:
   On Sun, Oct 18, 2009 at 12:53:56PM +0200, Michael S. Tsirkin wrote:
On Fri, Oct 16, 2009 at 12:29:29PM -0700, Sridhar Samudrala wrote:
 Hi Michael,

 We are trying out your vhost-net patches from your git trees on kernel.org.
 I am using mst/vhost.git as host kernel and mst/qemu-kvm.git for qemu.

 I am using the following qemu script to start the guest using userspace
 tap backend.

 home/sridhar/git/mst/qemu-kvm/x86_64-softmmu/qemu-system-x86_64
 /home/sridhar/kvm_images/fedora10-1-vm -m 512
 -drive file=/home/sridhar/kvm_images/fedora10-1-vm,if=virtio,index=0,boot=on
 -net nic,macaddr=54:52:00:35:e3:73,model=virtio
 -net tap,ifname=vnet0,script=no,downscript=no

 Now that i got the default backend to work, i 

Re: vhost-net patches

2009-10-20 Thread Michael S. Tsirkin
On Tue, Oct 20, 2009 at 10:27:38AM -0700, Shirley Ma wrote:
 
 Hello Michael,
 
 Here is the output. I am using a 2.6.32-rc3 guest kernel. It doesn't use
 MSI-X. So which guest kernel should I use?
 
 [...@localhost ~]$ cat /proc/interrupts
            CPU0
   0:        299   IO-APIC-edge      timer
   1:          2   IO-APIC-edge      i8042
   2:          0    XT-PIC-XT        cascade
   4:         76   IO-APIC-edge      serial
  11:       2126   IO-APIC-edge      virtio1, virtio0   <- here is the
 virtio for both disk and network i/o??

Yes, this is regular shared IRQ, no good.
I think your guest is too old, please use kernel 2.6.31 and up in guest.
I will work to improve the error message as well.
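
As a rough illustration of the check suggested here, the sketch below classifies a single /proc/interrupts line. It rests on an assumption not stated in this thread: that MSI/MSI-X interrupts carry "MSI" in the interrupt-type column (for example "PCI-MSI-edge"), while a shared legacy line shows "IO-APIC-...". The function name virtio_line_uses_msi() is hypothetical.

```c
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the given /proc/interrupts line
 * belongs to a virtio device and its type column mentions MSI,
 * 0 otherwise. Assumes MSI/MSI-X lines contain the substring "MSI".
 */
int virtio_line_uses_msi(const char *line)
{
    if (strstr(line, "virtio") == NULL)
        return 0;               /* not a virtio interrupt line */
    return strstr(line, "MSI") != NULL;
}
```

Fed the line for IRQ 11 from the output above ("IO-APIC-edge  virtio1, virtio0"), this returns 0, which is the shared legacy IRQ case described in the reply.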

  12: 89   IO-APIC-edge  i8042
 NMI:  0   Non-maskable interrupts
 LOC:   5146   Local timer interrupts
 SPU:  0   Spurious interrupts
 CNT:  0   Performance counter interrupts
 PND:  0   Performance pending work
 RES:  0   Rescheduling interrupts
 CAL:  0   Function call interrupts
 TLB:  0   TLB shootdowns
 TRM:  0   Thermal event interrupts
 MCE:  0   Machine check exceptions
 MCP:  1   Machine check polls
 ERR:  0
 MIS:  0
 [...@localhost ~]$ uname -r
 2.6.32-rc3
 
 
 Shirley Ma
 IBM Linux Technology Center
 15300 SW Koll Parkway
 Beaverton, OR 97006-6063
 Phone(Fax): (503) 578-7638
 
 
 
 

 Michael S. Tsirkin m...@redhat.com wrote on 10/20/2009 10:18 AM:
 To: Shirley Ma/Beaverton/i...@ibmus
 cc: David Stevens/Beaverton/i...@ibmus, kvm@vger.kernel.org,
 s...@linux.vnet.ibm.com
 Subject: Re: vhost-net patches
 
 On Tue, Oct 20, 2009 at 10:14:55AM -0700, Shirley Ma wrote:
 
  Hello Michael,
 
  what is vnet->vector?
  And what do you mean by not defined?
 
  In function:
 
  static int vhost_virtqueue_init()
  {
  ..
    r = vdev->binding->irqfd(vdev->binding_opaque, q->vector,
  vq->call);
  ..
  }.
 
  q->vector is 65535,
 
 Thanks for debugging this.
 I think this means that guest does not use MSI-X.
 
 You can verify this by booting guest without vhost,
 and performing the following command:
   cat /proc/interrupts
 
 Please note that you currently need recent kernel in guest,
 so that it uses MSI-X. I plan on implementing regular IRQ,
 but not yet, and it will be slower anyway.
 
  in   static int virtio_pci_irqfd()
  {
  ..
  if (vector >= proxy->pci_dev.msix_entries_nr) {
  fprintf(stderr, " pci irq fd returned vector %d, msix_entries_nr %d\n",
  vector, proxy->pci_dev.msix_entries_nr); <--- I added
  one output line here.
  return -EINVAL;
  
  }...
  The output is:
 
  pci irq fd returned vector 65535, msix_entries_nr 3, EINVAL is
  returned.
 
  thanks
  Shirley Ma
  IBM Linux Technology Center
  15300 SW Koll Parkway
  Beaverton, OR 97006-6063
  Phone(Fax): (503) 578-7638
 
 
 
 
 
  Michael S. Tsirkin m...@redhat.com wrote on 10/20/2009 04:34 AM:
  To: Shirley Ma/Beaverton/i...@ibmus
  cc: s...@linux.vnet.ibm.com, David Stevens/Beaverton/i...@ibmus,
  kvm@vger.kernel.org
  Subject: Re: vhost-net patches
 
  On Mon, Oct 19, 2009 at 04:08:24PM -0700, Shirley Ma wrote:
   Hello Michael,
  
   They all failed with the following error
   vhost_net_init returned -7
   This is an error message from hw/virtio-net.c:virtio_net_driver_ok() when
   vhost_net_start() fails. It looks like dev->binding->irqfd() is failing in
   vhost_virtqueue_init(). Haven't yet debugged further. I have CONFIG_EVENTFD
   enabled in the host kernel.
  
   From the debug output, looks like the vnet->vector is not defined,
 
  what is vnet->vector?
 

Re: [PATCH] v3: use upstream kvm_vcpu_ioctl

2009-10-20 Thread Glauber Costa
On Tue, Oct 20, 2009 at 03:10:18PM -0200, Marcelo Tosatti wrote:
 On Tue, Oct 20, 2009 at 11:36:58AM -0200, Glauber Costa wrote:
  [v2: we already return -errno, so fix testers ]
  [v3: keep error message for apic related failures ]
  
  Signed-off-by: Glauber Costa glom...@redhat.com
 
 Dropped, does not compile.
sorry, my bad, silly mistake.

will send another

 


[PATCH] v4: use upstream kvm_vcpu_ioctl

2009-10-20 Thread Glauber Costa
[v2: we already return -errno, so fix testers ]
[v3: keep error message for apic related failures ]
[v4: fix silly compile mistake ]
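
The "-errno" convention mentioned in the v2 note can be sketched as below. This is a simplified illustration under stated assumptions, not the kvm_vcpu_ioctl() from kvm-all.c: it takes a plain file descriptor instead of a CPUState, and the name ioctl_neg_errno() is made up. The point is only that callers test for r < 0 rather than r == -1 and no longer need to look at errno themselves.

```c
#include <errno.h>
#include <sys/ioctl.h>

/*
 * Hypothetical wrapper illustrating the convention: on failure, fold
 * errno into the return value as a negative number; on success, pass
 * the ioctl result through unchanged.
 */
int ioctl_neg_errno(int fd, unsigned long request, void *arg)
{
    int ret = ioctl(fd, request, arg);

    if (ret == -1)
        ret = -errno;
    return ret;
}
```

For example, calling it on an invalid descriptor yields -EBADF directly, so a caller can write "if (r < 0)" and report the error from r.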

Signed-off-by: Glauber Costa glom...@redhat.com
---
 kvm-all.c  |3 --
 qemu-kvm-x86.c |   90 +--
 qemu-kvm.c |   31 ---
 qemu-kvm.h |1 +
 4 files changed, 48 insertions(+), 77 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 0a8aa4c..50cd1fb 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -863,7 +863,6 @@ int kvm_vm_ioctl(KVMState *s, int type, ...)
 return ret;
 }
 
-#ifdef KVM_UPSTREAM
 int kvm_vcpu_ioctl(CPUState *env, int type, ...)
 {
 int ret;
@@ -881,8 +880,6 @@ int kvm_vcpu_ioctl(CPUState *env, int type, ...)
 return ret;
 }
 
-#endif
-
 int kvm_has_sync_mmu(void)
 {
 #ifdef KVM_CAP_SYNC_MMU
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index fb70ede..c1d0ae9 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -174,18 +174,11 @@ static int kvm_handle_tpr_access(CPUState *env)
 
 int kvm_enable_vapic(CPUState *env, uint64_t vapic)
 {
-   int r;
struct kvm_vapic_addr va = {
.vapic_addr = vapic,
};
 
-   r = ioctl(env->kvm_fd, KVM_SET_VAPIC_ADDR, &va);
-   if (r == -1) {
-   r = -errno;
-   perror("kvm_enable_vapic");
-   return r;
-   }
-   return 0;
+   return kvm_vcpu_ioctl(env, KVM_SET_VAPIC_ADDR, &va);
 }
 
 #endif
@@ -283,28 +276,29 @@ int kvm_destroy_memory_alias(kvm_context_t kvm, uint64_t phys_start)
 
 int kvm_get_lapic(CPUState *env, struct kvm_lapic_state *s)
 {
-   int r;
+int r = 0;
+
 if (!kvm_irqchip_in_kernel())
-   return 0;
-   r = ioctl(env->kvm_fd, KVM_GET_LAPIC, s);
-   if (r == -1) {
-   r = -errno;
-   perror("kvm_get_lapic");
-   }
-   return r;
+   return r;
+
+   r = kvm_vcpu_ioctl(env, KVM_GET_LAPIC, s);
+if (r < 0)
+fprintf(stderr, "KVM_GET_LAPIC failed\n");
+return r;
 }
 
 int kvm_set_lapic(CPUState *env, struct kvm_lapic_state *s)
 {
-   int r;
+int r = 0;
+
if (!kvm_irqchip_in_kernel())
return 0;
-   r = ioctl(env->kvm_fd, KVM_SET_LAPIC, s);
-   if (r == -1) {
-   r = -errno;
-   perror("kvm_set_lapic");
-   }
-   return r;
+
+   r = kvm_vcpu_ioctl(env, KVM_SET_LAPIC, s);
+
+if (r < 0)
+fprintf(stderr, "KVM_SET_LAPIC failed\n");
+return r;
 }
 
 #endif
@@ -356,7 +350,6 @@ int kvm_has_pit_state2(kvm_context_t kvm)
 void kvm_show_code(CPUState *env)
 {
 #define SHOW_CODE_LEN 50
-   int fd = env-kvm_fd;
struct kvm_regs regs;
struct kvm_sregs sregs;
int r, n;
@@ -365,13 +358,13 @@ void kvm_show_code(CPUState *env)
char code_str[SHOW_CODE_LEN * 3 + 1];
unsigned long rip;
 
-   r = ioctl(fd, KVM_GET_SREGS, &sregs);
-   if (r == -1) {
+   r = kvm_vcpu_ioctl(env, KVM_GET_SREGS, &sregs);
+   if (r < 0 ) {
 perror("KVM_GET_SREGS");
 return;
 }
-   r = ioctl(fd, KVM_GET_REGS, &regs);
-   if (r == -1) {
+   r = kvm_vcpu_ioctl(env, KVM_GET_REGS, &regs);
+   if (r < 0) {
 perror("KVM_GET_REGS");
return;
}
@@ -420,29 +413,25 @@ struct kvm_msr_list *kvm_get_msr_list(kvm_context_t kvm)
 int kvm_get_msrs(CPUState *env, struct kvm_msr_entry *msrs, int n)
 {
 struct kvm_msrs *kmsrs = qemu_malloc(sizeof *kmsrs + n * sizeof *msrs);
-int r, e;
+int r;
 
 kmsrs->nmsrs = n;
 memcpy(kmsrs->entries, msrs, n * sizeof *msrs);
-r = ioctl(env->kvm_fd, KVM_GET_MSRS, kmsrs);
-e = errno;
+r = kvm_vcpu_ioctl(env, KVM_GET_MSRS, kmsrs);
 memcpy(msrs, kmsrs->entries, n * sizeof *msrs);
 free(kmsrs);
-errno = e;
 return r;
 }
 
 int kvm_set_msrs(CPUState *env, struct kvm_msr_entry *msrs, int n)
 {
 struct kvm_msrs *kmsrs = qemu_malloc(sizeof *kmsrs + n * sizeof *msrs);
-int r, e;
+int r;
 
 kmsrs->nmsrs = n;
 memcpy(kmsrs->entries, msrs, n * sizeof *msrs);
-r = ioctl(env->kvm_fd, KVM_SET_MSRS, kmsrs);
-e = errno;
+r = kvm_vcpu_ioctl(env, KVM_SET_MSRS, kmsrs);
 free(kmsrs);
-errno = e;
 return r;
 }
 
@@ -464,7 +453,7 @@ int kvm_get_mce_cap_supported(kvm_context_t kvm, uint64_t *mce_cap,
 int kvm_setup_mce(CPUState *env, uint64_t *mcg_cap)
 {
 #ifdef KVM_CAP_MCE
-return ioctl(env->kvm_fd, KVM_X86_SETUP_MCE, mcg_cap);
+return kvm_vcpu_ioctl(env, KVM_X86_SETUP_MCE, mcg_cap);
 #else
 return -ENOSYS;
 #endif
@@ -473,7 +462,7 @@ int kvm_setup_mce(CPUState *env, uint64_t *mcg_cap)
 int kvm_set_mce(CPUState *env, struct kvm_x86_mce *m)
 {
 #ifdef KVM_CAP_MCE
-return ioctl(env->kvm_fd, KVM_X86_SET_MCE, m);
+return kvm_vcpu_ioctl(env, KVM_X86_SET_MCE, m);
 #else
 return -ENOSYS;
 #endif
@@ -496,13 +485,12 @@ static void print_dt(FILE *file, const char *name, struct 
kvm_dtable 

Re: List of unaccessible x86 states

2009-10-20 Thread Marcelo Tosatti
On Tue, Oct 20, 2009 at 03:01:15PM +0200, Jan Kiszka wrote:
 Hi all,
 
 as the list of yet user-unaccessible x86 states is a bit volatile ATM,
 this is an attempt to collect the precise requirements for additional
 state fields. Once everyone feels the list is complete, we can decide
 how to partition it into one or more substates for the new
 KVM_GET/SET_VCPU_STATE interface.
 
 What I read so far (or tried to patch already):
 
 - nmi_masked
 - nmi_pending
 - nmi_injected
 - kvm_queued_exception (whole struct content)
 - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
 
 Unclear points (for me) from the last discussion:
 
 - sipi_vector
 - MCE (covered via kvm_queued_exception, or does it require more?)

Should save/restore the MCE MSRs (their contents are currently
lost/overwritten AFAICS).

MTRR contents are also dropped.



GDB + KVM Debug

2009-10-20 Thread Saksena, Abhishek
I have now tried using both


Set arch i8086 and 
Set arch i386:x86-64:intel 

But still see the same issue. Do I need to apply any patch?


Abhishek

-----Original Message-----
From: Jan Kiszka [mailto:jan.kis...@siemens.com] 
Sent: Thursday, September 17, 2009 1:36 AM
To: Saksena, Abhishek
Cc: kvm@vger.kernel.org
Subject: Re: GDB + KVM Debug

Saksena, Abhishek wrote:
 I am using KVM-88. However, I still can't get gdb working. I started qemu with
 the -s -S options, and when I try to connect gdb to it I get the following error:
 
 (gdb) target remote lochost:1234
 lochost: unknown host
 lochost:1234: No such file or directory.
 (gdb) target remote locahost:1234
 locahost: unknown host
 locahost:1234: No such file or directory.
 (gdb) target remote localhost:1234
 Remote debugging using localhost:1234
 [New Thread 1]
 Remote 'g' packet reply is too long: 
 2306f0ff023002f07f03000
0
 (gdb)
 

Try 'set arch <target-architecture>' before connecting. This is required
if you didn't load the corresponding target image into gdb.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


Re: List of unaccessible x86 states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 03:51:02PM +0200, Alexander Graf wrote:
 
 On 20.10.2009, at 15:48, Gleb Natapov wrote:
 
 On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:
 
 On 20.10.2009, at 15:37, Jan Kiszka wrote:
 
 Alexander Graf wrote:
 On 20.10.2009, at 15:01, Jan Kiszka wrote:
 
 Hi all,
 
 as the list of yet user-unaccessible x86 states is a bit
 volatile ATM,
 this is an attempt to collect the precise requirements for
 additional
 state fields. Once everyone feels the list is complete, we can
 decide
 how to partition it into one or more substates for the new
 KVM_GET/SET_VCPU_STATE interface.
 
 What I read so far (or tried to patch already):
 
 - nmi_masked
 - nmi_pending
 - nmi_injected
 - kvm_queued_exception (whole struct content)
 - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
 
 Unclear points (for me) from the last discussion:
 
 - sipi_vector
 - MCE (covered via kvm_queued_exception, or does it require more?)
 
 Please extend or correct the list as required.
 
 hflags. Qemu supports GIF, kvm supports GIF, but no side
 knows how to
 sync it.
 
 BTW, GIF is related to svm nesting, right?
 
 Yes and no. It's an architecture addition that came with SVM, yes.
 
 The problem is that I don't want to support migrating while in a
 Why not?
 
 Because then we'd have to transfer the whole host cpu cache and the
 merged intercept bitmaps to userspace as well. That's just too many
 internals to expose IMHO.
 
But the amount of information is constant no matter how many L2 guests there
are. Correct? We can expose it as a separate substate.

 nested VM. We can just #VMEXIT just before migrating with a
 VMEXIT_INTR intercept.
 
 We don't notify kernel about migration currently. CPU state is
 migrated
 when VM is already paused, how we can exit nested guest at this point?
 
 Hm - introduce a new ioctl? I haven't fully thought it through yet :-).
 
There is no software problem that can't be solved by introducing a new
ioctl :)

--
Gleb.


Re: List of unaccessible x86 states

2009-10-20 Thread Alexander Graf


On 20.10.2009, at 20:55, Gleb Natapov wrote:


On Tue, Oct 20, 2009 at 03:51:02PM +0200, Alexander Graf wrote:


On 20.10.2009, at 15:48, Gleb Natapov wrote:


On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:


On 20.10.2009, at 15:37, Jan Kiszka wrote:


Alexander Graf wrote:

On 20.10.2009, at 15:01, Jan Kiszka wrote:


Hi all,

as the list of yet user-unaccessible x86 states is a bit
volatile ATM,
this is an attempt to collect the precise requirements for
additional
state fields. Once everyone feels the list is complete, we can
decide
how to partition it into one or more substates for the new
KVM_GET/SET_VCPU_STATE interface.

What I read so far (or tried to patch already):

- nmi_masked
- nmi_pending
- nmi_injected
- kvm_queued_exception (whole struct content)
- KVM_REQ_TRIPLE_FAULT (from vcpu.requests)

Unclear points (for me) from the last discussion:

- sipi_vector
- MCE (covered via kvm_queued_exception, or does it require  
more?)


Please extend or correct the list as required.


hflags. Qemu supports GIF, kvm supports GIF, but no side
knows how to
sync it.


BTW, GIF is related to svm nesting, right?


Yes and no. It's an architecture addition that came with SVM, yes.

The problem is that I don't want to support migrating while in a

Why not?


Because then we'd have to transfer the whole host cpu cache and the
merged intercept bitmaps to userspace as well. That's just too many
internals to expose IMHO.


But the amount of information is constant no matter how many L2 guests there
are. Correct? We can expose it as a separate substate.


Or we can just not migrate while in a nested guest :-). Which will  
make everything a lot easier.


Alex


Re: List of unaccessible x86 states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 08:59:48PM +0200, Alexander Graf wrote:
 
 On 20.10.2009, at 20:55, Gleb Natapov wrote:
 
 On Tue, Oct 20, 2009 at 03:51:02PM +0200, Alexander Graf wrote:
 
 On 20.10.2009, at 15:48, Gleb Natapov wrote:
 
 On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:
 
 On 20.10.2009, at 15:37, Jan Kiszka wrote:
 
 Alexander Graf wrote:
 On 20.10.2009, at 15:01, Jan Kiszka wrote:
 
 Hi all,
 
 as the list of yet user-unaccessible x86 states is a bit
 volatile ATM,
 this is an attempt to collect the precise requirements for
 additional
 state fields. Once everyone feels the list is complete, we can
 decide
 how to partition it into one or more substates for the new
 KVM_GET/SET_VCPU_STATE interface.
 
 What I read so far (or tried to patch already):
 
 - nmi_masked
 - nmi_pending
 - nmi_injected
 - kvm_queued_exception (whole struct content)
 - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
 
 Unclear points (for me) from the last discussion:
 
 - sipi_vector
 - MCE (covered via kvm_queued_exception, or does it
 require more?)
 
 Please extend or correct the list as required.
 
 hflags. Qemu supports GIF, kvm supports GIF, but no side
 knows how to
 sync it.
 
 BTW, GIF is related to svm nesting, right?
 
 Yes and no. It's an architecture addition that came with SVM, yes.
 
 The problem is that I don't want to support migrating while in a
 Why not?
 
 Because then we'd have to transfer the whole host cpu cache and the
 merged intercept bitmaps to userspace as well. That's just too many
 internals to expose IMHO.
 
 But the amount of information is constant no matter how many L2 guests there
 are. Correct? We can expose it as a separate substate.
 
 Or we can just not migrate while in a nested guest :-). Which will
 make everything a lot easier.
 
Suppose we have an L2 guest that handles interrupts/NMIs by itself; how can we
force it to exit? I don't think requesting a certain CPU state before
migration is the right thing to do. What if the user paused a VM and then
decided to migrate? Or the VM was paused automatically because of a shortage
of disk space and management wants to migrate the VM to another host with a
bigger disk?

--
Gleb.


[PATCH] move hlt exit to arch-specific code, and use upstream version.

2009-10-20 Thread Glauber Costa
HLT exit calls directly an arch-specific function. Furthermore,
upstream qemu already places it on arch specific code, so let's follow it.

The function that handles halt itself is almost equal between them. So
let's use it.

Signed-off-by: Glauber Costa glom...@redhat.com
---
 qemu-kvm-x86.c|   14 +++---
 qemu-kvm.c|3 ---
 target-i386/kvm.c |2 ++
 3 files changed, 5 insertions(+), 14 deletions(-)

diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index c1d0ae9..6573dc5 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -199,6 +199,9 @@ int kvm_arch_run(CPUState *env)
r = kvm_handle_tpr_access(env);
break;
 #endif
+case KVM_EXIT_HLT:
+r = kvm_handle_halt(env);
+break;
default:
r = 1;
break;
@@ -1377,17 +1380,6 @@ int kvm_arch_init_vcpu(CPUState *cenv)
 return 0;
 }
 
-int kvm_arch_halt(CPUState *env)
-{
-
-if (!((env->interrupt_request & CPU_INTERRUPT_HARD) &&
- (env->eflags & IF_MASK)) &&
-   !(env->interrupt_request & CPU_INTERRUPT_NMI)) {
-env->halted = 1;
-}
-return 1;
-}
-
 void kvm_arch_pre_kvm_run(void *opaque, CPUState *env)
 {
 if (!kvm_irqchip_in_kernel())
diff --git a/qemu-kvm.c b/qemu-kvm.c
index b8ae4d8..42ead38 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -1002,9 +1002,6 @@ int kvm_run(CPUState *env)
 case KVM_EXIT_MMIO:
 r = handle_mmio(env);
 break;
-case KVM_EXIT_HLT:
-r = kvm_arch_halt(env);
-break;
 case KVM_EXIT_IRQ_WINDOW_OPEN:
 break;
 case KVM_EXIT_SHUTDOWN:
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 1cf0dc3..de10ef1 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -761,6 +761,7 @@ int kvm_arch_post_run(CPUState *env, struct kvm_run *run)
 
 return 0;
 }
+#endif
 
 static int kvm_handle_halt(CPUState *env)
 {
@@ -775,6 +776,7 @@ static int kvm_handle_halt(CPUState *env)
 return 1;
 }
 
+#ifdef KVM_UPSTREAM
 int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run)
 {
 int ret = 0;
-- 
1.6.2.5
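
For reference, the halt condition in the removed kvm_arch_halt() can be sketched in isolation as follows. Everything here is a stand-in for illustration: the struct, the SKETCH_* flag values, and sketch_handle_halt() are hypothetical, and the real constants and CPUState live in the QEMU headers. Only the shape of the condition is taken from the patch: the vcpu is marked halted unless a hard interrupt is pending with interrupts enabled, or an NMI is pending.

```c
#include <stdint.h>

/* Illustrative flag values only; the real ones are defined by QEMU. */
#define SKETCH_INTERRUPT_HARD 0x0002u
#define SKETCH_INTERRUPT_NMI  0x0200u
#define SKETCH_IF_MASK        0x0200u

/* Hypothetical stand-in for the few CPUState fields the check uses. */
struct sketch_cpu {
    uint32_t interrupt_request;
    uint32_t eflags;
    int halted;
};

/* Mirrors the condition of the removed kvm_arch_halt(). */
int sketch_handle_halt(struct sketch_cpu *env)
{
    int hard_deliverable =
        (env->interrupt_request & SKETCH_INTERRUPT_HARD) &&
        (env->eflags & SKETCH_IF_MASK);
    int nmi_pending =
        (env->interrupt_request & SKETCH_INTERRUPT_NMI) != 0;

    if (!hard_deliverable && !nmi_pending)
        env->halted = 1;    /* nothing can wake the vcpu right now */

    return 1;               /* the removed function always returned 1 */
}
```

Note that a pending hard interrupt with IF cleared still leads to halted = 1, since the interrupt cannot be delivered until the guest re-enables interrupts.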



Re: List of unaccessible x86 states

2009-10-20 Thread Alexander Graf


On 20.10.2009, at 21:09, Gleb Natapov wrote:


On Tue, Oct 20, 2009 at 08:59:48PM +0200, Alexander Graf wrote:


On 20.10.2009, at 20:55, Gleb Natapov wrote:


On Tue, Oct 20, 2009 at 03:51:02PM +0200, Alexander Graf wrote:


On 20.10.2009, at 15:48, Gleb Natapov wrote:


On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:


On 20.10.2009, at 15:37, Jan Kiszka wrote:


Alexander Graf wrote:

On 20.10.2009, at 15:01, Jan Kiszka wrote:


Hi all,

as the list of yet user-unaccessible x86 states is a bit
volatile ATM,
this is an attempt to collect the precise requirements for
additional
state fields. Once everyone feels the list is complete, we can
decide
how to partition it into one or more substates for the new
KVM_GET/SET_VCPU_STATE interface.

What I read so far (or tried to patch already):

- nmi_masked
- nmi_pending
- nmi_injected
- kvm_queued_exception (whole struct content)
- KVM_REQ_TRIPLE_FAULT (from vcpu.requests)

Unclear points (for me) from the last discussion:

- sipi_vector
- MCE (covered via kvm_queued_exception, or does it
require more?)

Please extend or correct the list as required.


hflags. Qemu supports GIF, kvm supports GIF, but no side
knows how to
sync it.


BTW, GIF is related to svm nesting, right?


Yes and no. It's an architecture addition that came with SVM,  
yes.


The problem is that I don't want to support migrating while in a

Why not?


Because then we'd have to transfer the whole host cpu cache and the
merged intercept bitmaps to userspace as well. That's just too many
internals to expose IMHO.

But the amount of information is constant no matter how many L2 guests there
are. Correct? We can expose it as a separate substate.


Or we can just not migrate while in a nested guest :-). Which will
make everything a lot easier.

Suppose we have an L2 guest that handles interrupts/NMIs by itself; how can we
force it to exit?


If the nested hypervisor doesn't intercept INTR we don't support it  
anyways.



I don't think requesting certain cpu state before
migration is the right thing to do. What if user paused a VM and then
decided to migrate?


So pausing has to make it go out of nested guest context too?
Then we're not in the nested guest context, right? :)


Or VM was paused automatically because of shortage
of disk space and management want to migrate VM to other host with
bigger disk?


Same as before.


Really, pushing the whole nesting state over is not a good idea.

Alex


Re: List of unaccessible x86 states

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 09:23:22PM +0200, Alexander Graf wrote:
 
 On 20.10.2009, at 21:09, Gleb Natapov wrote:
 
 On Tue, Oct 20, 2009 at 08:59:48PM +0200, Alexander Graf wrote:
 
 On 20.10.2009, at 20:55, Gleb Natapov wrote:
 
 On Tue, Oct 20, 2009 at 03:51:02PM +0200, Alexander Graf wrote:
 
 On 20.10.2009, at 15:48, Gleb Natapov wrote:
 
 On Tue, Oct 20, 2009 at 03:41:57PM +0200, Alexander Graf wrote:
 
 On 20.10.2009, at 15:37, Jan Kiszka wrote:
 
 Alexander Graf wrote:
 On 20.10.2009, at 15:01, Jan Kiszka wrote:
 
 Hi all,
 
 as the list of yet user-unaccessible x86 states is a bit
 volatile ATM,
 this is an attempt to collect the precise requirements for
 additional
 state fields. Once everyone feels the list is complete, we can
 decide
 how to partition it into one or more substates for the new
 KVM_GET/SET_VCPU_STATE interface.
 
 What I read so far (or tried to patch already):
 
 - nmi_masked
 - nmi_pending
 - nmi_injected
 - kvm_queued_exception (whole struct content)
 - KVM_REQ_TRIPLE_FAULT (from vcpu.requests)
 
 Unclear points (for me) from the last discussion:
 
 - sipi_vector
 - MCE (covered via kvm_queued_exception, or does it
 require more?)
 
 Please extend or correct the list as required.
 
 hflags. Qemu supports GIF, kvm supports GIF, but no side
 knows how to
 sync it.
 
 BTW, GIF is related to svm nesting, right?
 
 Yes and no. It's an architecture addition that came with
 SVM, yes.
 
 The problem is that I don't want to support migrating while in a
 Why not?
 
 Because then we'd have to transfer the whole host cpu cache and the
 merged intercept bitmaps to userspace as well. That's just too many
 internals to expose IMHO.
 
 But the amount of information is constant no matter how many L2 guests there
 are. Correct? We can expose it as a separate substate.
 
 Or we can just not migrate while in a nested guest :-). Which will
 make everything a lot easier.
 
 Suppose we have an L2 guest that handles interrupts/NMIs by itself; how can we
 force it to exit?
 
 If the nested hypervisor doesn't intercept INTR we don't support it
 anyways.
 
Why? I looked at the code briefly and it looks like we just inject the
interrupt as usual instead of doing a nested exit if L2 does not intercept
INTR. Have I misinterpreted the code? Even if I have, why not support
it?

 I don't think requesting certain cpu state before
 migration is the right thing to do. What if user paused a VM and then
 decided to migrate?
 
 So pausing has to make it go out of nested guest context too?
Probably.

 Then we're not in the nested guest context, right? :)
 
 Or VM was paused automatically because of shortage
 of disk space and management want to migrate VM to other host with
 bigger disk?
 
 Same as before.
What do you mean?

 
 
 Really, pushing the whole nesting state over is not a good idea.
 
Maybe just disallow migration with a nested guest running, then? Cross-vendor
migration is not possible anyway.

--
Gleb.


Re: [PATCH] move hlt exit to arch-specific code, and use upstream version.

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 05:15:32PM -0200, Glauber Costa wrote:
 HLT exit calls directly an arch-specific function. Furthermore,
 upstream qemu already places it on arch specific code, so let's follow it.
 
 The function that handles halt itself is almost equal between them. So
 let's use it.
 
kvm_handle_halt() may return 1. If it does, kvm_arch_run() will return 1
too, and kvm_run() will abort.

 Signed-off-by: Glauber Costa glom...@redhat.com
 ---
  qemu-kvm-x86.c|   14 +++---
  qemu-kvm.c|3 ---
  target-i386/kvm.c |2 ++
  3 files changed, 5 insertions(+), 14 deletions(-)
 
 diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
 index c1d0ae9..6573dc5 100644
 --- a/qemu-kvm-x86.c
 +++ b/qemu-kvm-x86.c
 @@ -199,6 +199,9 @@ int kvm_arch_run(CPUState *env)
   r = kvm_handle_tpr_access(env);
   break;
  #endif
 +case KVM_EXIT_HLT:
 +r = kvm_handle_halt(env);
 +break;
   default:
   r = 1;
   break;
 @@ -1377,17 +1380,6 @@ int kvm_arch_init_vcpu(CPUState *cenv)
  return 0;
  }
  
 -int kvm_arch_halt(CPUState *env)
 -{
 -
 -if (!((env->interrupt_request & CPU_INTERRUPT_HARD) &&
 -   (env->eflags & IF_MASK)) &&
 - !(env->interrupt_request & CPU_INTERRUPT_NMI)) {
 -env->halted = 1;
 -}
 -return 1;
 -}
 -
  void kvm_arch_pre_kvm_run(void *opaque, CPUState *env)
  {
  if (!kvm_irqchip_in_kernel())
 diff --git a/qemu-kvm.c b/qemu-kvm.c
 index b8ae4d8..42ead38 100644
 --- a/qemu-kvm.c
 +++ b/qemu-kvm.c
 @@ -1002,9 +1002,6 @@ int kvm_run(CPUState *env)
  case KVM_EXIT_MMIO:
  r = handle_mmio(env);
  break;
 -case KVM_EXIT_HLT:
 -r = kvm_arch_halt(env);
 -break;
  case KVM_EXIT_IRQ_WINDOW_OPEN:
  break;
  case KVM_EXIT_SHUTDOWN:
 diff --git a/target-i386/kvm.c b/target-i386/kvm.c
 index 1cf0dc3..de10ef1 100644
 --- a/target-i386/kvm.c
 +++ b/target-i386/kvm.c
 @@ -761,6 +761,7 @@ int kvm_arch_post_run(CPUState *env, struct kvm_run *run)
  
  return 0;
  }
 +#endif
  
  static int kvm_handle_halt(CPUState *env)
  {
 @@ -775,6 +776,7 @@ static int kvm_handle_halt(CPUState *env)
  return 1;
  }
  
 +#ifdef KVM_UPSTREAM
  int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run)
  {
  int ret = 0;
 -- 
 1.6.2.5
 

--
Gleb.


Re: [PATCH] move hlt exit to arch-specific code, and use upstream version.

2009-10-20 Thread Glauber Costa
On Tue, Oct 20, 2009 at 09:47:44PM +0200, Gleb Natapov wrote:
 On Tue, Oct 20, 2009 at 05:15:32PM -0200, Glauber Costa wrote:
  HLT exit calls directly an arch-specific function. Furthermore,
  upstream qemu already places it on arch specific code, so let's follow it.
  
  The function that handles halt itself is almost equal between them. So
  let's use it.
  
 kvm_handle_halt() may return 1. If it does kvm_arch_run() will return 1
 too and kvm_run() will abort.

kvm_arch_halt() may return 1 as well.




Re: [PATCH] move hlt exit to arch-specific code, and use upstream version.

2009-10-20 Thread Gleb Natapov
On Tue, Oct 20, 2009 at 05:56:35PM -0200, Glauber Costa wrote:
 On Tue, Oct 20, 2009 at 09:47:44PM +0200, Gleb Natapov wrote:
  On Tue, Oct 20, 2009 at 05:15:32PM -0200, Glauber Costa wrote:
   HLT exit calls directly an arch-specific function. Furthermore,
   upstream qemu already places it on arch specific code, so let's follow it.
   
   The function that handles halt itself is almost equal between them. So
   let's use it.
   
  kvm_handle_halt() may return 1. If it does kvm_arch_run() will return 1
  too and kvm_run() will abort.
 
 kvm_arch_halt() may return 1 as well.
 
But it's called from another place and its return value is handled
differently.
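
The disagreement above is about call sites, not about the value returned. A toy model in Python (not the C of the patch) may make that concrete. Everything here is an assumption for illustration: the names arch_halt, generic_loop_hlt, arch_dispatch_hlt and UnhandledExit are hypothetical, the flag values are placeholders, and the two call-site behaviours are modelled only from what is said in this thread.

```python
from dataclasses import dataclass

# Placeholder flag values, chosen only so the bit tests below work.
CPU_INTERRUPT_HARD = 0x02
CPU_INTERRUPT_NMI = 0x200
IF_MASK = 0x200


@dataclass
class Cpu:
    interrupt_request: int = 0
    eflags: int = 0
    halted: int = 0


class UnhandledExit(Exception):
    """Stands in for the abort described in the thread."""


def arch_halt(cpu):
    # Same condition as the removed kvm_arch_halt(): halt unless a
    # deliverable hard interrupt or an NMI is pending.
    hard_irq = (bool(cpu.interrupt_request & CPU_INTERRUPT_HARD)
                and bool(cpu.eflags & IF_MASK))
    nmi = bool(cpu.interrupt_request & CPU_INTERRUPT_NMI)
    if not hard_irq and not nmi:
        cpu.halted = 1
    return 1


def generic_loop_hlt(cpu):
    # Modelled call site before the patch: a nonzero result simply
    # leaves the run loop and is handed back to the caller.
    return arch_halt(cpu)


def arch_dispatch_hlt(cpu):
    # Modelled call site after the patch: the dispatcher's caller treats
    # a nonzero result as "exit not handled".
    r = arch_halt(cpu)
    if r:
        raise UnhandledExit("nonzero return treated as unhandled exit")
    return r
```

In this model the same return value of 1 ends the loop quietly at one call site and raises at the other, which is the concern about routing HLT through kvm_arch_run() without also changing the return convention.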

--
Gleb.


Re: KVM: VMX: flush TLB with INVEPT on cpu migration

2009-10-20 Thread Max Laier
On Friday 02 October 2009 00:16:58 you wrote:
 It is possible that stale EPTP-tagged mappings are used, if a
 vcpu migrates to a different pcpu.
 
 Set KVM_REQ_TLB_FLUSH in vmx_vcpu_load, when switching pcpus, which
 will invalidate both VPID and EPT mappings on the next vm-entry.

Thank you - I was on the brink of a nervous breakdown before discovering
this. Maybe it would help in the future to add a comment to
ept_misconfig_inspect_spte explaining that this might also be caused by
out-of-sync TLBs (especially when it doesn't show an apparent cause of the
misconfig).
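
For readers following along, the deferred-flush idea in the quoted description can be sketched as a toy model in Python (not kernel C). This is only an illustration under stated assumptions: the Vcpu class, the flushes counter and the bit index are hypothetical, and the real flush is of course a hardware operation, not a counter.

```python
KVM_REQ_TLB_FLUSH = 0  # hypothetical bit index, for illustration only


class Vcpu:
    def __init__(self):
        self.cpu = -1       # physical cpu the vcpu last ran on
        self.requests = 0   # bitmask of pending requests
        self.flushes = 0    # counts flushes performed at entry


def vcpu_load(vcpu, cpu):
    # On a switch to a different physical cpu, only record that a
    # flush is needed; do not flush here.
    if vcpu.cpu != cpu:
        vcpu.requests |= 1 << KVM_REQ_TLB_FLUSH
        vcpu.cpu = cpu


def vcpu_enter_guest(vcpu):
    # The pending request is consumed on the next entry, so several
    # migrations between two entries cost a single flush.
    bit = 1 << KVM_REQ_TLB_FLUSH
    if vcpu.requests & bit:
        vcpu.requests &= ~bit
        vcpu.flushes += 1
```

The point of the sketch is the ordering: the load path sets a bit, and the entry path is the one place that acts on it.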

 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
 
 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index e86f1a6..97f4265 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -708,7 +708,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
   if (vcpu->cpu != cpu) {
   vcpu_clear(vmx);
   kvm_migrate_timers(vcpu);
 - vpid_sync_vcpu_all(vmx);
 + set_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests);
   local_irq_disable();
   list_add(&vmx->local_vcpus_link,
   &per_cpu(vcpus_on_cpu, cpu));
 

-- 
/\  Best regards,  | mla...@freebsd.org
\ /  Max Laier  | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mla...@efnet
/ \  ASCII Ribbon Campaign  | Against HTML Mail and News


[PATCH] Add 'downscript=no' into kvm command line

2009-10-20 Thread Yolkfull Chow
If no downscript is assigned, add 'downscript=no' to avoid error:

/etc/qemu-ifdown: could not launch network script

Signed-off-by: Yolkfull Chow yz...@redhat.com
---
 client/tests/kvm/kvm_vm.py |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py
index a8d96ca..0b8efbc 100755
--- a/client/tests/kvm/kvm_vm.py
+++ b/client/tests/kvm/kvm_vm.py
@@ -252,6 +252,8 @@ class VM:
 if script_path:
     script_path = kvm_utils.get_path(root_dir, script_path)
     qemu_cmd += ",downscript=%s" % script_path
+else:
+    qemu_cmd += ",downscript=no"
 # Proceed to next NIC
 vlan += 1
 
-- 
1.6.2.5
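
As a standalone illustration of the behaviour this patch is after, here is a minimal sketch. The helper name downscript_option is hypothetical and not part of kvm_vm.py; it only isolates the if/else from the hunk above so it can be tried on its own.

```python
def downscript_option(downscript_path=None):
    """Return the downscript suffix for a tap NIC option string."""
    if downscript_path:
        return ",downscript=%s" % downscript_path
    # No script configured: say so explicitly instead of leaving the
    # option out, so the default /etc/qemu-ifdown is not attempted.
    return ",downscript=no"
```

The design choice is to always emit the option, so the command line never depends on whether a default script happens to exist on the host.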



[ANNOUNCE] Sheepdog: Distributed Storage System for KVM

2009-10-20 Thread MORITA Kazutaka

Hi everyone,

Sheepdog is a distributed storage system for KVM/QEMU. It provides
highly available block level storage volumes to VMs like Amazon EBS.
Sheepdog supports advanced volume management features such as snapshot,
cloning, and thin provisioning. Sheepdog runs on several tens or hundreds
of nodes, and the architecture is fully symmetric; there is no central
node such as a meta-data server.

The following list describes the features of Sheepdog.

* Linear scalability in performance and capacity
* No single point of failure
* Redundant architecture (data is written to multiple nodes)
- Tolerance against network failure
* Zero configuration (newly added machines will join the cluster 
automatically)
- Autonomous load balancing
* Snapshot
- Online snapshot from qemu-monitor
* Clone from a snapshot volume
* Thin provisioning
- Amazon EBS API support (to use from a Eucalyptus instance)

(* = current features, - = on our todo list)

More details and download links are here:

http://www.osrg.net/sheepdog/

Note that the code is still in an early stage.
There are some critical TODO items:

- VM image deletion support
- Support architectures other than X86_64
- Data recovery
- Free space management
- Guarantee reliability and availability under heavy load
- Performance improvement
- Reclaim unused blocks
- More documentation

We hope to find people interested in working together.
Enjoy!


Here are examples:

- create images

$ kvm-img create -f sheepdog "Alice's Disk" 256G
$ kvm-img create -f sheepdog "Bob's Disk" 256G

- list images

$ shepherd info -t vdi
   4 : Alice's Disk  256 GB (allocated: 0 MB, shared: 0 MB), 2009-10-15 16:17:18, tag:0, current
   8 : Bob's Disk    256 GB (allocated: 0 MB, shared: 0 MB), 2009-10-15 16:29:20, tag:0, current

- start up a virtual machine

$ kvm --drive format=sheepdog,file="Alice's Disk"

- create a snapshot

$ kvm-img snapshot -c name sheepdog:"Alice's Disk"

- clone from a snapshot

$ kvm-img create -b sheepdog:"Alice's Disk":0 -f sheepdog "Charlie's Disk"


Thanks.

--
MORITA, Kazutaka

NTT Cyber Space Labs
OSS Computing Project
Kernel Group
E-mail: morita.kazut...@lab.ntt.co.jp



Re: [PATCH 26/27] Enable 32bit dirty log pointers on 64bit host

2009-10-20 Thread Alexander Graf


On 30.09.2009, at 15:29, Avi Kivity wrote:


On 09/30/2009 03:17 PM, Avi Kivity wrote:

 {
 struct page *page[1];
@@ -2331,7 +2374,7 @@ static int kvm_vm_mmap(struct file *file,  
struct vm_area_struct *vma)

 static struct file_operations kvm_vm_fops = {
 .release= kvm_vm_release,
 .unlocked_ioctl = kvm_vm_ioctl,
-.compat_ioctl   = kvm_vm_ioctl,
+.compat_ioctl   = kvm_vm_compat_ioctl,
 .mmap   = kvm_vm_mmap,
 };
 static int kvm_vm_fault(struct vm_area_struct *vma, struct  
vm_fault *vmf)


This is a bit painful - I tried to avoid compat_ioctl.  Maybe it's  
better to have dirty_bitmap_virt, given no existing users are  
impacted.




But that misses compat_ptr().  So it looks like we'll need  
compat_ioctl.


Patch looks fine, except s/log.log/log/.  I'd also use
sizeof(compat_log) instead of sizeof(log) to avoid frightening reviewers.
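
A small sketch of why a 32-bit pointer needs translating on a 64-bit host, written in Python with the struct module rather than kernel C. The layout used here is an assumption for illustration only (two 32-bit fields followed by an 8-byte slot that holds the pointer, big-endian 32-bit userspace); the function names are hypothetical and none of this is taken from the patch itself.

```python
import struct


def pack_from_32bit_be_userspace(slot, bitmap_ptr):
    # Assumed layout: slot, padding, a 4-byte pointer, then 4 unused
    # bytes that fill out the 8-byte slot.
    return struct.pack(">IIII", slot, 0, bitmap_ptr, 0)


def read_as_native_64bit(buf):
    # A 64-bit reader takes the whole 8-byte slot as one pointer, so on
    # big-endian the 32-bit value lands in the high half.
    slot, _pad, ptr = struct.unpack(">IIQ", buf)
    return slot, ptr


def read_as_compat(buf):
    # A compat reader takes only the 4-byte pointer and zero-extends it,
    # which is the role compat_ptr() plays in the discussion above.
    slot, _pad, ptr32, _unused = struct.unpack(">IIII", buf)
    return slot, ptr32
```

Under these assumptions the native read yields the pointer shifted left by 32 bits, while the compat read recovers the value userspace actually passed.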


So has there been any decision on which road to take here?

Alex

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 26/27] Enable 32bit dirty log pointers on 64bit host

2009-10-20 Thread Alexander Graf


On 20.10.2009, at 15:23, Avi Kivity wrote:


On 10/20/2009 07:09 PM, Alexander Graf wrote:


On 30.09.2009, at 15:29, Avi Kivity wrote:


On 09/30/2009 03:17 PM, Avi Kivity wrote:

{
struct page *page[1];
@@ -2331,7 +2374,7 @@ static int kvm_vm_mmap(struct file *file,  
struct vm_area_struct *vma)

static struct file_operations kvm_vm_fops = {
.release= kvm_vm_release,
.unlocked_ioctl = kvm_vm_ioctl,
-.compat_ioctl   = kvm_vm_ioctl,
+.compat_ioctl   = kvm_vm_compat_ioctl,
.mmap   = kvm_vm_mmap,
};
static int kvm_vm_fault(struct vm_area_struct *vma, struct  
vm_fault *vmf)


This is a bit painful - I tried to avoid compat_ioctl.  Maybe  
it's better to have dirty_bitmap_virt, given no existing users  
are impacted.




But that misses compat_ptr().  So it looks like we'll need  
compat_ioctl.


Patch looks fine, except s/log.log/log/.  I'd also use
sizeof(compat_log) instead of sizeof(log) to avoid frightening reviewers.


So has there been any decision on which road to take here?


compat_ioctl, and being more careful in the future.


So I'll include Arnd's patch in my patchset instead?

Alex


Re: [PATCH 26/27] Enable 32bit dirty log pointers on 64bit host

2009-10-20 Thread Avi Kivity

On 10/20/2009 10:28 PM, Alexander Graf wrote:

compat_ioctl, and being more careful in the future.



So I'll include Arnd's patch in my patchset instead?


Send it independently and Marcelo or myself will apply it.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
