[PATCH] KVM: make 'lapic_timer_ops' and 'kpit_ops' static

2009-03-12 Thread Avi Kivity
From: Hannes Eder han...@hanneseder.net

Fix these sparse warnings:
  arch/x86/kvm/lapic.c:916:22: warning: symbol 'lapic_timer_ops' was not declared. Should it be static?
  arch/x86/kvm/i8254.c:268:22: warning: symbol 'kpit_ops' was not declared. Should it be static?

Signed-off-by: Hannes Eder han...@hanneseder.net
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 4e2e3f2..cf09bb6 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -265,7 +265,7 @@ static bool kpit_is_periodic(struct kvm_timer *ktimer)
 	return ps->is_periodic;
 }
 
-struct kvm_timer_ops kpit_ops = {
+static struct kvm_timer_ops kpit_ops = {
.is_periodic = kpit_is_periodic,
 };
 
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index dd934d2..4d76bb6 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -913,7 +913,7 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
kvm_apic_local_deliver(apic, APIC_LVT0);
 }
 
-struct kvm_timer_ops lapic_timer_ops = {
+static struct kvm_timer_ops lapic_timer_ops = {
.is_periodic = lapic_is_periodic,
 };
 
--
To unsubscribe from this list: send the line unsubscribe kvm-commits in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Issues with virtio_net in multiple guests?

2009-03-12 Thread Jim Paris
Ken Robertson wrote:
 Hoping someone can help me track down an issue I'm experiencing on a
 KVM machine I built recently.
...
 SIOCSIFFLAGS: Cannot assign requested address
 
 The address isn't in use or anything, so no reason I can think of why
 it can't assign it.  It recognizes the device, however can't bring it
 up.  All the VMs have unique MAC addresses, randomly generated.  One
 of the ones that doesn't work is using 93:01:dc:a0:f0:57.

That MAC address is not valid.  The LSB of the first byte should be 0 to
indicate unicast, and the second LSB of the first byte should be 1 to
indicate a locally-assigned address.  e.g. 92:01:dc:a0:f0:57 should
work.
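
The two bits Jim mentions can be fixed up mechanically. A minimal sketch in C, assuming the six address bytes were generated at random; the helper name is hypothetical and not part of KVM or libvirt:

```c
#include <stdint.h>

/* Hypothetical helper: make six arbitrary bytes usable as a guest MAC. */
static void make_local_unicast(uint8_t mac[6])
{
    mac[0] &= (uint8_t)~0x01u;  /* clear bit 0 of the first byte: unicast */
    mac[0] |= 0x02u;            /* set bit 1 of the first byte: locally assigned */
}
```

Applied to the address from the report, 93:01:dc:a0:f0:57, this yields the 92:01:dc:a0:f0:57 suggested above; the other five bytes are left alone.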

-jim



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Issues with virtio_net in multiple guests?

2009-03-12 Thread Ken Robertson
Jim,

That was it!  I didn't realize there was some significance of certain
bits within the address.  Changing that first byte resolved the issue.
 Should I be setting the 2nd bit in the LSB to 1?  I started logging
into all the systems I have access to and realized all of them have 00
as the first byte in the address.  Should I just stick to 00 on all
mine?  Or by making that 2nd bit 1, does that force the card to
inherit an address from libvirt or somewhere else instead of the VM
configuration?  I'll play around with it some more, but at least that
mystery is solved.

BTW, was the first time posting on this list and love it!  Quickest
response I've ever gotten on a mailing list. :)

Thanks!

Ken



Re: Issues with virtio_net in multiple guests?

2009-03-12 Thread Jim Paris
Ken Robertson wrote:
 Jim,
 
 That was it!  I didn't realize there was some significance of certain
 bits within the address.  Changing that first byte resolved the issue.
  Should I be setting the 2nd bit in the LSB to 1?  I started logging
 into all the systems I have access to and realized all of them have 00
 as the first byte in the address.  Should I just stick to 00 on all
 mine?  Or by making that 2nd bit 1, does that force the card to
 inherit an address from libvirt or somewhere else instead of the VM
 configuration?  I'll play around with it some more, but at least that
 mystery is solved.

The 2nd LSB of the first byte just says it's a locally generated
address rather than one of the officially-assigned OUIs.

The 1st LSB of the first byte is the important one as that's what the
Linux kernel checks and rejects if it's set (it also rejects an
address of all zeros).
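
The check described here can be sketched in a few lines of C. This is an illustration of the rule as stated in this thread, not a copy of the kernel source (the kernel has its own helper, is_valid_ether_addr()); the function names below are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of the first byte set means multicast (or broadcast). */
static bool mac_is_multicast(const uint8_t mac[6])
{
    return (mac[0] & 0x01u) != 0;
}

/* True when all six bytes are zero. */
static bool mac_is_zero(const uint8_t mac[6])
{
    for (int i = 0; i < 6; i++) {
        if (mac[i] != 0)
            return false;
    }
    return true;
}

/* An interface address must be unicast and must not be all zeros. */
static bool mac_is_usable(const uint8_t mac[6])
{
    return !mac_is_multicast(mac) && !mac_is_zero(mac);
}
```

Under this rule an address starting with 00 is fine as long as the remaining bytes are not all zero, which is why the vendor-assigned addresses Ken saw on his other systems work.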

-jim


Re: kvm-autotest -- introducing kvm_runtest_2

2009-03-12 Thread Ryan Harper
* Michael Goldish mgold...@redhat.com [2009-03-12 02:26]:
 
   Regarding the stepfiles you created for Linux -- I can't help much
   with those since I don't have the data. I do believe that if I had
   the data and the stepfiles I could quickly identify the problem, so
   if you think those can be sent to us, I'd like to have them.
  
  I created a stepfile for RHEL5 and what I'm seeing is that one of the
  screens I captured in stepmaker ended up having a focus ring around
  something and on replay the focus isn't there.  This situation isn't
  something that a new algo will fix as you pointed out.  I'm wondering
  if this is something you've seen.  I don't quite understand how it
  would happen since stepmaker and the replay send the same keystrokes.
  I also don't see how in general this can be avoided.
 
 The problem sounds familiar. Does the ring appear around one of the
 GNOME menubars, i.e. around Applications or System? GNOME seems to
 be somewhat indeterministic with those rings. If you run the stepfile
 several times, you'll notice that in most cases you'll see a focus
 ring (or no focus ring, I don't quite remember) and the rest of the
 time you'll get the other case.

Ding Ding Ding! =)

 
 This can be avoided either with experience, or a good wiki entry on
 picking the right barriers (which we plan to create). But you don't
 have to avoid making mistakes with stepmaker -- most types of mistakes
 are fixed very quickly and easily with stepeditor.

Yep, used stepeditor to fix; definitely worth documenting where one
should be invoking stepeditor -- from the steps dir; if you don't run it
from there, it won't find the steps_data dir =(

 
 The fix depends on exactly what you were trying to do:
 
 - If you sent alt-f1 to open the menu, and in the following step
 picked the open menu (including the Applications caption itself) to
 make sure it was open -- use stepeditor to modify the barrier so that
 it doesn't include the Applications caption or anything that might
 have a ring around it.

That worked for me.


 
 The following text was copied from your previous e-mail:
 
  I do have the debug dir data from these runs.  Looking at the cropped
  ppm and screendump ppm is how I determined that there must be
  something wrong with how the image is rendered since the cropped ppm
  matches the screendump output, but with whatever subtle difference
  that generates a different md5sum.
 
 I'm not sure my previous e-mail was clear enough, so just in case it
 wasn't, let me rephrase: The cropped ppm is generated from the
 screendump ppm every time the stepfile running module receives a
 screendump from the guest in order to see if it matches a barrier.
 This is done for debugging purposes. If you somehow check, you'll see
 there is no subtle difference between those two files. It wouldn't
 make sense to find a subtle difference between them, and if you did
 find one, it certainly wouldn't indicate a stepfile problem, but
 rather a very strange bug in the framework.  You should be looking for
 subtle differences between the screendump ppm and the reference
 screendump ppm, as well as between the cropped screendump ppm and the
 reference cropped screendump ppm. By reference I mean coming from
 the stepmaker data. If you don't have the stepmaker data, you have no
 way of knowing what caused the difference in the md5sums.

Right -- the real win was comparing the full screendump to the reference
screendump - basically, without the reference dumps, the debug output
isn't useful.

I'll have to go back and re-read your email on where to put the
reference ppm files so one gets the reference comparison.

 
 
 There are two other things I forgot to mention in my previous e-mail:
 
 The Windows failures you're seeing might be caused by KVM bugs other
 than the one I mentioned. KVM-84 has a very strong tendency to crash
 during Windows installations. You can use the logs to find out if that
 happened in your case. If you have the latest git HEAD the exception
 info will look something like "Barrier timed out at step ... (VM is
 dead)", and if you have a slightly older version, you'll probably see
 "(guest is stuck)" at the end of the info string. You should also see
 the system consistently complaining that it can't fetch any
 screendumps from the guest (this will appear in stdout).

I've seen those on kvm-84.

 The other thing has to do with the ISO files. kvm_runtest has a very
 important feature that we innocently forgot to implement in
 kvm_runtest_2 -- md5sum verification of the ISO files. This means that
 the framework currently makes no use of the md5sum and md5sum_1m
 parameters in the config file. This means you might be using different
 ISOs than the ones we made our stepfiles with. In that case I wouldn't
 expect any stepfile to succeed. However, if you used these same ISOs
 with kvm_runtest then they should be fine. In any case, I'll add the
 feature ASAP to the git repository.

Right - I 

Re: Problem with KVM-84 and more than 4 processors

2009-03-12 Thread Avi Kivity

Matthias Hovestadt wrote:

Hi!

I am unable to reproduce - 'modprobe kvm' gets me the expected lsmod 
line.

Can you reproduce with plain 2.6.27 instead of the gentoo build?


No, with a vanilla kernel from kernel.org it seems to work fine.


Please take it up with the gentoo kernel team then.

--
error compiling committee.c: too many arguments to function



[PATCH] compile fix - avoid raw string literal

2009-03-12 Thread Jochen Roth
This patch fixes compilation problems of kvm-userspace on current gcc 
4.4 compilers which implement the following standard: 
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2442.htm
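
The failure mode can be shown in miniature. The test code builds register names by pasting a string macro R between two string literals; under the raw string literal proposal linked above, an R written directly against an opening quote can be lexed as the start of a raw string instead of as a macro name. A minimal sketch, assuming R is defined as "r" or "e" depending on the target (the function name is made up for illustration):

```c
#include <string.h>

#ifdef __x86_64__
#define R "r"   /* 64-bit register prefix: rdi, rsi, ... */
#else
#define R "e"   /* 32-bit register prefix: edi, esi, ... */
#endif

/*
 * Adjacent string literals are concatenated by the compiler. The space
 * between R and the next literal keeps the lexer from treating R"...
 * as a raw string prefix.
 */
static const char *push_di_insn(void)
{
    return "push %" R "di";
}
```

The spacing has no effect on the resulting string; on x86_64 the function returns the text push %rdi.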


Signed-off-by: Jochen Roth jr...@linux.vnet.ibm.com

diff --git a/user/test/x86/apic.c b/user/test/x86/apic.c
index 9c6205b..7794615 100644
--- a/user/test/x86/apic.c
+++ b/user/test/x86/apic.c
@@ -54,14 +54,14 @@ asm (
 "push %r9  \n\t"
 "push %r8  \n\t"
 #endif
-"push %"R"di \n\t"
-"push %"R"si \n\t"
-"push %"R"bp \n\t"
-"push %"R"sp \n\t"
-"push %"R"bx \n\t"
-"push %"R"dx \n\t"
-"push %"R"cx \n\t"
-"push %"R"ax \n\t"
+"push %"R "di \n\t"
+"push %"R "si \n\t"
+"push %"R "bp \n\t"
+"push %"R "sp \n\t"
+"push %"R "bx \n\t"
+"push %"R "dx \n\t"
+"push %"R "cx \n\t"
+"push %"R "ax \n\t"
 #ifdef __x86_64__
 "mov %rsp, %rdi \n\t"
 "callq *8*16(%rsp) \n\t"
@@ -70,14 +70,14 @@ asm (
 "calll *4+4*8(%esp) \n\t"
 "add $4, %esp \n\t"
 #endif
-"pop %"R"ax \n\t"
-"pop %"R"cx \n\t"
-"pop %"R"dx \n\t"
-"pop %"R"bx \n\t"
-"pop %"R"bp \n\t"
-"pop %"R"bp \n\t"
-"pop %"R"si \n\t"
-"pop %"R"di \n\t"
+"pop %"R "ax \n\t"
+"pop %"R "cx \n\t"
+"pop %"R "dx \n\t"
+"pop %"R "bx \n\t"
+"pop %"R "bp \n\t"
+"pop %"R "bp \n\t"
+"pop %"R "si \n\t"
+"pop %"R "di \n\t"
 #ifdef __x86_64__
 "pop %r8  \n\t"
 "pop %r9  \n\t"
diff --git a/user/test/x86/vmexit.c b/user/test/x86/vmexit.c
index bd57bfa..981d6c1 100644
--- a/user/test/x86/vmexit.c
+++ b/user/test/x86/vmexit.c
@@ -31,7 +31,7 @@ int main()
 
 	t1 = rdtsc();
 	for (i = 0; i < N; ++i)
-		asm volatile ("push %%"R"bx; cpuid; pop %%"R"bx"
+		asm volatile ("push %%"R "bx; cpuid; pop %%"R "bx"
 			      : : : "eax", "ecx", "edx");
 	t2 = rdtsc();
 	printf("vmexit latency: %d\n", (int)((t2 - t1) / N));



[PATCH 02/16] kvm: deassign irq for INTx

2009-03-12 Thread Sheng Yang
From: Marcelo Tosatti mtosa...@redhat.com

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 qemu/hw/device-assignment.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 7c73210..19848b4 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -536,6 +536,14 @@ static int assign_irq(AssignedDevInfo *adev)
         calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
     assigned_irq_data.guest_irq = irq;
     assigned_irq_data.host_irq = dev->real_device.irq;
+#ifdef KVM_CAP_ASSIGN_DEV_IRQ
+    assigned_irq_data.flags = KVM_DEV_IRQ_HOST_INTX | KVM_DEV_IRQ_GUEST_INTX;
+    r = kvm_deassign_irq(kvm_context, &assigned_irq_data);
+    /* -ENXIO means no assigned irq */
+    if (r && r != -ENXIO)
+        perror("assign_irq: deassign");
+#endif
+
     r = kvm_assign_irq(kvm_context, &assigned_irq_data);
     if (r < 0) {
         fprintf(stderr, "Failed to assign irq for \"%s\": %s\n",
-- 
1.5.4.5



[PATCH 10/16] kvm: Support MSI convert to INTx in device assignment

2009-03-12 Thread Sheng Yang

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 qemu/hw/device-assignment.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index bda0e95..01485d7 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -588,7 +588,11 @@ static int assign_irq(AssignedDevInfo *adev)
     assigned_irq_data.guest_irq = irq;
     assigned_irq_data.host_irq = dev->real_device.irq;
 #ifdef KVM_CAP_ASSIGN_DEV_IRQ
-    assigned_irq_data.flags = KVM_DEV_IRQ_HOST_INTX | KVM_DEV_IRQ_GUEST_INTX;
+    if (dev->cap.available & ASSIGNED_DEVICE_CAP_MSI)
+        assigned_irq_data.flags = KVM_DEV_IRQ_HOST_MSI | KVM_DEV_IRQ_GUEST_INTX;
+    else
+        assigned_irq_data.flags = KVM_DEV_IRQ_HOST_INTX | KVM_DEV_IRQ_GUEST_INTX;
+
     r = kvm_deassign_irq(kvm_context, &assigned_irq_data);
     /* -ENXIO means no assigned irq */
     if (r && r != -ENXIO)
-- 
1.5.4.5



[PATCH 11/16] Add MSI-X related macro to pci.c

2009-03-12 Thread Sheng Yang

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 qemu/hw/pci.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/qemu/hw/pci.h b/qemu/hw/pci.h
index 127dbed..1392626 100644
--- a/qemu/hw/pci.h
+++ b/qemu/hw/pci.h
@@ -206,6 +206,7 @@ typedef struct PCIIORegion {
 #define PCI_CAPABILITY_CONFIG_MAX_LENGTH 0x60
 #define PCI_CAPABILITY_CONFIG_DEFAULT_START_ADDR 0x40
 #define PCI_CAPABILITY_CONFIG_MSI_LENGTH 0x10
+#define PCI_CAPABILITY_CONFIG_MSIX_LENGTH 0x10
 
 struct PCIDevice {
 /* PCI config space */
-- 
1.5.4.5



[PATCH 0/16 v5] Device assignment improvement in userspace

2009-03-12 Thread Sheng Yang
Patches 1 and 2 are new; all the others have been sent before.

This (huge) patchset contains:

Patches 1..2 add the new interface that follows the reworked device
assignment kernel part.

Patches 3..6 add a generic capability support mechanism. These could be
adopted by QEMU upstream as well.

Patches 7..10 enable MSI with device assignment on KVM. Because the reworked
device assignment kernel part dropped the MSI-to-INTx conversion mechanism,
patch 10 enables it again in userspace.

Patches 11..13 enable MSI-X with device assignment on KVM.

Patches 14..16 enable SR-IOV with KVM.

Updates since the last series:

1. Convert to the new ioctl interface.
2. Merge the capability configuration space with the PCIDevice one.
3. Better support for deassigning an IRQ (driver unload) with MSI/MSI-X.
4. No longer assume that IRQ0 means no INTx; check the interrupt pin field in
configuration space instead.
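
The interrupt pin check from item 4 can be sketched as below. This is an illustration under assumptions, not code from the series: it operates on a plain 256-byte copy of configuration space, and the macro and function names are made up. The only fact it relies on is the standard PCI header layout, where the Interrupt Pin register sits at offset 0x3d and reads 0 when the function uses no INTx pin.

```c
#include <stdbool.h>
#include <stdint.h>

#define CFG_INTERRUPT_LINE 0x3c  /* IRQ number routed by the platform */
#define CFG_INTERRUPT_PIN  0x3d  /* 0 = no INTx, 1..4 = INTA#..INTD# */

/* Decide whether a function uses INTx from the pin field, not the IRQ number. */
static bool device_uses_intx(const uint8_t config[256])
{
    return config[CFG_INTERRUPT_PIN] != 0;
}
```

A device that reports interrupt line 0 but a nonzero pin is still treated as an INTx user under this test, which is the case the old IRQ0 assumption got wrong.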

Please help to review! Thanks!

--
 libkvm/kvm-common.h |1 +
 libkvm/libkvm.c |  176 +--
 libkvm/libkvm.h |   58 +-
 qemu/Makefile.target|1 +
 qemu/configure  |   20 ++
 qemu/hw/device-assignment.c |  526 +--
 qemu/hw/device-assignment.h |   17 ++
 qemu/hw/pci.c   |   77 ++-
 qemu/hw/pci.h   |   38 +++
 9 files changed, 871 insertions(+), 43 deletions(-)



[PATCH 07/16] kvm: user interface for MSI type irq routing

2009-03-12 Thread Sheng Yang

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 libkvm/libkvm.c |   98 ---
 libkvm/libkvm.h |   22 
 2 files changed, 101 insertions(+), 19 deletions(-)

diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
index 80a0481..e9bae23 100644
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -1265,11 +1265,12 @@ int kvm_clear_gsi_routes(kvm_context_t kvm)
 #endif
 }
 
-int kvm_add_irq_route(kvm_context_t kvm, int gsi, int irqchip, int pin)
+int kvm_add_routing_entry(kvm_context_t kvm,
+		          struct kvm_irq_routing_entry* entry)
 {
 #ifdef KVM_CAP_IRQ_ROUTING
 	struct kvm_irq_routing *z;
-	struct kvm_irq_routing_entry *e;
+	struct kvm_irq_routing_entry *new;
 	int n, size;
 
 	if (kvm->irq_routes->nr == kvm->nr_allocated_irq_routes) {
@@ -1277,7 +1278,7 @@ int kvm_add_irq_route(kvm_context_t kvm, int gsi, int irqchip, int pin)
 		if (n < 64)
 			n = 64;
 		size = sizeof(struct kvm_irq_routing);
-		size += n * sizeof(*e);
+		size += n * sizeof(*new);
 		z = realloc(kvm->irq_routes, size);
 		if (!z)
 			return -ENOMEM;
@@ -1285,34 +1286,77 @@ int kvm_add_irq_route(kvm_context_t kvm, int gsi, int irqchip, int pin)
 		kvm->irq_routes = z;
 	}
 	n = kvm->irq_routes->nr++;
-	e = &kvm->irq_routes->entries[n];
-	memset(e, 0, sizeof(*e));
-	e->gsi = gsi;
-	e->type = KVM_IRQ_ROUTING_IRQCHIP;
-	e->flags = 0;
-	e->u.irqchip.irqchip = irqchip;
-	e->u.irqchip.pin = pin;
+	new = &kvm->irq_routes->entries[n];
+	memset(new, 0, sizeof(*new));
+	new->gsi = entry->gsi;
+	new->type = entry->type;
+	new->flags = entry->flags;
+	new->u = entry->u;
 	return 0;
 #else
 	return -ENOSYS;
 #endif
 }
 
-int kvm_del_irq_route(kvm_context_t kvm, int gsi, int irqchip, int pin)
+int kvm_add_irq_route(kvm_context_t kvm, int gsi, int irqchip, int pin)
+{
+#ifdef KVM_CAP_IRQ_ROUTING
+	struct kvm_irq_routing_entry e;
+
+	e.gsi = gsi;
+	e.type = KVM_IRQ_ROUTING_IRQCHIP;
+	e.flags = 0;
+	e.u.irqchip.irqchip = irqchip;
+	e.u.irqchip.pin = pin;
+	return kvm_add_routing_entry(kvm, &e);
+#else
+	return -ENOSYS;
+#endif
+}
+
+int kvm_del_routing_entry(kvm_context_t kvm,
+	                  struct kvm_irq_routing_entry* entry)
 {
 #ifdef KVM_CAP_IRQ_ROUTING
 	struct kvm_irq_routing_entry *e, *p;
-	int i;
+	int i, found = 0;
 
 	for (i = 0; i < kvm->irq_routes->nr; ++i) {
 		e = &kvm->irq_routes->entries[i];
-		if (e->type == KVM_IRQ_ROUTING_IRQCHIP
-		    && e->gsi == gsi
-		    && e->u.irqchip.irqchip == irqchip
-		    && e->u.irqchip.pin == pin) {
-			p = &kvm->irq_routes->entries[--kvm->irq_routes->nr];
-			*e = *p;
-			return 0;
+		if (e->type == entry->type
+		    && e->gsi == entry->gsi) {
+			switch (e->type)
+			{
+			case KVM_IRQ_ROUTING_IRQCHIP: {
+				if (e->u.irqchip.irqchip ==
+				    entry->u.irqchip.irqchip
+				    && e->u.irqchip.pin ==
+				    entry->u.irqchip.pin) {
+					p = &kvm->irq_routes->
+						entries[--kvm->irq_routes->nr];
+					*e = *p;
+					found = 1;
+				}
+				break;
+			}
+			case KVM_IRQ_ROUTING_MSI: {
+				if (e->u.msi.address_lo ==
+				    entry->u.msi.address_lo
+				    && e->u.msi.address_hi ==
+				    entry->u.msi.address_hi
+				    && e->u.msi.data == entry->u.msi.data) {
+					p = &kvm->irq_routes->
+						entries[--kvm->irq_routes->nr];
+					*e = *p;
+					found = 1;
+				}
+				break;
+			}
+			default:
+				break;
+			}
+			if (found)
+				return 0;
 		}
 	}
 	return -ESRCH;
@@ -1321,6 +1365,22 @@ int kvm_del_irq_route(kvm_context_t kvm, int gsi, int irqchip, int pin)
 #endif
 }
 
+int kvm_del_irq_route(kvm_context_t kvm, int gsi, int irqchip, int pin)
+{
+#ifdef KVM_CAP_IRQ_ROUTING
+	struct kvm_irq_routing_entry e;
+
+	e.gsi = gsi;
+   

[PATCH 03/16] kvm: Replace force type convert with container_of()

2009-03-12 Thread Sheng Yang

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 qemu/hw/device-assignment.c |   20 ++++++++++++--------
 1 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 19848b4..e8a69ba 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -144,7 +144,7 @@ static uint32_t assigned_dev_ioport_readl(void *opaque, uint32_t addr)
 static void assigned_dev_iomem_map(PCIDevice *pci_dev, int region_num,
                                    uint32_t e_phys, uint32_t e_size, int type)
 {
-    AssignedDevice *r_dev = (AssignedDevice *) pci_dev;
+    AssignedDevice *r_dev = container_of(pci_dev, AssignedDevice, dev);
     AssignedDevRegion *region = &r_dev->v_addrs[region_num];
     uint32_t old_ephys = region->e_physbase;
     uint32_t old_esize = region->e_size;
@@ -175,7 +175,7 @@ static void assigned_dev_iomem_map(PCIDevice *pci_dev, int region_num,
 static void assigned_dev_ioport_map(PCIDevice *pci_dev, int region_num,
                                     uint32_t addr, uint32_t size, int type)
 {
-    AssignedDevice *r_dev = (AssignedDevice *) pci_dev;
+    AssignedDevice *r_dev = container_of(pci_dev, AssignedDevice, dev);
     AssignedDevRegion *region = &r_dev->v_addrs[region_num];
     int first_map = (region->e_size == 0);
     CPUState *env;
@@ -224,6 +224,7 @@ static void assigned_dev_pci_write_config(PCIDevice *d, uint32_t address,
 {
     int fd;
     ssize_t ret;
+    AssignedDevice *pci_dev = container_of(d, AssignedDevice, dev);
 
     DEBUG("(%x.%x): address=%04x val=0x%08x len=%d\n",
           ((d->devfn >> 3) & 0x1F), (d->devfn & 0x7),
@@ -245,7 +246,7 @@ static void assigned_dev_pci_write_config(PCIDevice *d, uint32_t address,
           ((d->devfn >> 3) & 0x1F), (d->devfn & 0x7),
           (uint16_t) address, val, len);
 
-    fd = ((AssignedDevice *)d)->real_device.config_fd;
+    fd = pci_dev->real_device.config_fd;
 
 again:
     ret = pwrite(fd, &val, len, address);
@@ -266,6 +267,7 @@ static uint32_t assigned_dev_pci_read_config(PCIDevice *d, uint32_t address,
     uint32_t val = 0;
     int fd;
     ssize_t ret;
+    AssignedDevice *pci_dev = container_of(d, AssignedDevice, dev);
 
     if ((address >= 0x10 && address <= 0x24) || address == 0x34 ||
         address == 0x3c || address == 0x3d) {
@@ -279,7 +281,7 @@ static uint32_t assigned_dev_pci_read_config(PCIDevice *d, uint32_t address,
     if (address == 0xFC)
         goto do_log;
 
-    fd = ((AssignedDevice *)d)->real_device.config_fd;
+    fd = pci_dev->real_device.config_fd;
 
 again:
     ret = pread(fd, &val, len, address);
@@ -618,15 +620,17 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus)
 {
     int r;
     AssignedDevice *dev;
+    PCIDevice *pci_dev;
     uint8_t e_device, e_intx;
 
     DEBUG("Registering real physical device %s (bus=%x dev=%x func=%x)\n",
           adev->name, adev->bus, adev->dev, adev->func);
 
-    dev = (AssignedDevice *)
-        pci_register_device(bus, adev->name, sizeof(AssignedDevice),
-                            -1, assigned_dev_pci_read_config,
-                            assigned_dev_pci_write_config);
+    pci_dev = pci_register_device(bus, adev->name,
+              sizeof(AssignedDevice), -1, assigned_dev_pci_read_config,
+              assigned_dev_pci_write_config);
+    dev = container_of(pci_dev, AssignedDevice, dev);
+
     if (NULL == dev) {
         fprintf(stderr, "%s: Error: Couldn't register real device %s\n",
                 __func__, adev->name);
-- 
1.5.4.5
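
For readers unfamiliar with the idiom, here is a minimal sketch of what container_of() buys over a plain cast. The struct and function names are invented for the example and do not match the QEMU types; the macro is the usual offsetof-based definition.

```c
#include <stddef.h>

/* Recover a pointer to the enclosing struct from a pointer to one member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct pci_device_sketch {
    int devfn;
};

struct assigned_device_sketch {
    int placed_first;              /* the embedded device is NOT the first member */
    struct pci_device_sketch dev;
    int config_fd;
};

static struct assigned_device_sketch *
to_assigned(struct pci_device_sketch *d)
{
    /* A plain cast of d would be wrong here because dev is at a nonzero offset. */
    return container_of(d, struct assigned_device_sketch, dev);
}
```

One caveat worth keeping in mind when reading the last hunk: container_of() applied to a NULL pointer yields NULL only when the member sits at offset 0, so a NULL test placed after the conversion depends on the struct layout.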



[PATCH 12/16] kvm: add ioctl KVM_SET_MSIX_ENTRY_NR and KVM_SET_MSIX_ENTRY

2009-03-12 Thread Sheng Yang

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 libkvm/libkvm.c |   25 +++++++++++++++++++++++++
 libkvm/libkvm.h |    7 +++++++
 2 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
index 405b0bf..f8129a4 100644
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -1410,3 +1410,28 @@ int kvm_get_irq_route_gsi(kvm_context_t kvm)
     return KVM_IOAPIC_NUM_PINS;
 }
 
+#ifdef KVM_CAP_DEVICE_MSIX
+int kvm_assign_set_msix_nr(kvm_context_t kvm,
+                           struct kvm_assigned_msix_nr *msix_nr)
+{
+    int ret;
+
+    ret = ioctl(kvm->vm_fd, KVM_ASSIGN_SET_MSIX_NR, msix_nr);
+    if (ret < 0)
+        return -errno;
+
+    return ret;
+}
+
+int kvm_assign_set_msix_entry(kvm_context_t kvm,
+                              struct kvm_assigned_msix_entry *entry)
+{
+    int ret;
+
+    ret = ioctl(kvm->vm_fd, KVM_ASSIGN_SET_MSIX_ENTRY, entry);
+    if (ret < 0)
+        return -errno;
+
+    return ret;
+}
+#endif
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
index 9a7cbc6..d3e431a 100644
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -854,4 +854,11 @@ int kvm_commit_irq_routes(kvm_context_t kvm);
  * \param kvm Pointer to the current kvm_context
  */
 int kvm_get_irq_route_gsi(kvm_context_t kvm);
+
+#ifdef KVM_CAP_DEVICE_MSIX
+int kvm_assign_set_msix_nr(kvm_context_t kvm,
+                           struct kvm_assigned_msix_nr *msix_nr);
+int kvm_assign_set_msix_entry(kvm_context_t kvm,
+                              struct kvm_assigned_msix_entry *entry);
+#endif
 #endif
-- 
1.5.4.5
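
Both wrappers in this patch follow the same convention: forward the request to ioctl() and turn a failure into a negative errno value. A stand-alone sketch of that pattern, with a hypothetical function name and no KVM-specific request codes:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>

/* Hypothetical wrapper: returns ioctl's result, or -errno on failure. */
static int ioctl_neg_errno(int fd, unsigned long request, void *arg)
{
    int ret = ioctl(fd, request, arg);

    if (ret < 0)
        return -errno;  /* capture errno right away, before any other call */
    return ret;
}
```

Returning -errno lets callers compare against specific codes directly, in the same way patch 02 of this series tests for -ENXIO after kvm_deassign_irq().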



[PATCH 15/16] KVM: Fill config with correct VID/DID

2009-03-12 Thread Sheng Yang
An SR-IOV virtual function does not show the correct Vendor ID/Device ID in
config space, so we have to fill them in manually from the device and vendor
files in sysfs.
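
The byte layout involved is small enough to show on its own. In a PCI configuration header the Vendor ID occupies offsets 0..1 and the Device ID offsets 2..3, both little-endian. The sketch below parses the text form found in the sysfs files (for example "0x8086") from a string rather than from a file so that it stays self-contained; the function name is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Parse an ID in the textual form used by sysfs and store it as a
 * little-endian 16-bit value at config[offset] and config[offset + 1].
 * Returns 0 on success, -1 if the text could not be parsed.
 */
static int fill_id(uint8_t *config, int offset, const char *sysfs_text)
{
    long id;

    if (sscanf(sysfs_text, "%li", &id) != 1)
        return -1;
    config[offset] = (uint8_t)(id & 0xff);
    config[offset + 1] = (uint8_t)((id >> 8) & 0xff);
    return 0;
}
```

Calling it with offset 0 for the vendor text and offset 2 for the device text reproduces the four assignments made in the patch.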

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 qemu/hw/device-assignment.c |   31 ++++++++++++++++++++++++++++++-
 1 files changed, 30 insertions(+), 1 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 69f8e3a..ea67ce9 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -317,7 +317,8 @@ static uint32_t assigned_dev_pci_read_config(PCIDevice *d, uint32_t address,
     ssize_t ret;
     AssignedDevice *pci_dev = container_of(d, AssignedDevice, dev);
 
-    if ((address >= 0x10 && address <= 0x24) || address == 0x34 ||
+    if (address < 0x4 ||
+       (address >= 0x10 && address <= 0x24) || address == 0x34 ||
         address == 0x3c || address == 0x3d ||
         pci_access_cap_config(d, address, len)) {
         val = pci_default_read_config(d, address, len);
@@ -429,6 +430,7 @@ static int get_real_device(AssignedDevice *pci_dev, uint8_t r_bus,
     int fd, r = 0;
     FILE *f;
     unsigned long long start, end, size, flags;
+    unsigned long id;
     PCIRegion *rp;
     PCIDevRegions *dev = &pci_dev->real_device;
 
@@ -488,6 +490,33 @@ again:
         DEBUG("region %d size %d start 0x%llx type %d resource_fd %d\n",
               r, rp->size, start, rp->type, rp->resource_fd);
     }
+
+    fclose(f);
+
+    /* read and fill vendor ID */
+    snprintf(name, sizeof(name), "%svendor", dir);
+    f = fopen(name, "r");
+    if (f == NULL) {
+        fprintf(stderr, "%s: %s: %m\n", __func__, name);
+        return 1;
+    }
+    if (fscanf(f, "%li\n", &id) == 1) {
+        pci_dev->dev.config[0] = id & 0xff;
+        pci_dev->dev.config[1] = (id & 0xff00) >> 8;
+    }
+    fclose(f);
+
+    /* read and fill device ID */
+    snprintf(name, sizeof(name), "%sdevice", dir);
+    f = fopen(name, "r");
+    if (f == NULL) {
+        fprintf(stderr, "%s: %s: %m\n", __func__, name);
+        return 1;
+    }
+    if (fscanf(f, "%li\n", &id) == 1) {
+        pci_dev->dev.config[2] = id & 0xff;
+        pci_dev->dev.config[3] = (id & 0xff00) >> 8;
+    }
     fclose(f);
 
     dev->region_number = r;
-- 
1.5.4.5



[PATCH 06/16] Support for device capability

2009-03-12 Thread Sheng Yang
This framework can be easily extended to support device capability, like
MSI/MSI-x.
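
The core of the framework is a range test that decides whether a config-space access falls inside the capability window and should be routed to the capability handlers. A minimal sketch of such a test follows; the struct, its fields, and the function name are assumptions made for the example, and treating the window as half-open (upper bound checked with <=) is this sketch's own choice.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a capability region in config space. */
struct cap_window {
    bool supported;   /* capability support enabled for this device */
    uint32_t start;   /* first config-space offset of the region */
    uint32_t length;  /* size of the region in bytes */
};

/* True when the whole access [address, address + len) lies in the window. */
static bool access_in_cap_window(const struct cap_window *cap,
                                 uint32_t address, uint32_t len)
{
    return cap->supported &&
           address >= cap->start &&
           address + len <= cap->start + cap->length;
}
```

With a window starting at 0x40 and 0x10 bytes long, a 4-byte access at 0x4c is inside the window while one at 0x4e is not, because it would cross the end of the region.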

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 qemu/hw/pci.c |   77 +++-
 qemu/hw/pci.h |   29 +
 2 files changed, 104 insertions(+), 2 deletions(-)

diff --git a/qemu/hw/pci.c b/qemu/hw/pci.c
index 821646c..eca0517 100644
--- a/qemu/hw/pci.c
+++ b/qemu/hw/pci.c
@@ -427,8 +427,8 @@ static void pci_update_mappings(PCIDevice *d)
 }
 }
 
-uint32_t pci_default_read_config(PCIDevice *d,
- uint32_t address, int len)
+static uint32_t pci_read_config(PCIDevice *d,
+uint32_t address, int len)
 {
 uint32_t val;
 
@@ -453,6 +453,45 @@ uint32_t pci_default_read_config(PCIDevice *d,
 return val;
 }
 
+static void pci_write_config(PCIDevice *pci_dev,
+ uint32_t address, uint32_t val, int len)
+{
+int i;
+for (i = 0; i  len; i++) {
+pci_dev-config[address + i] = val  0xff;
+val = 8;
+}
+}
+
+int pci_access_cap_config(PCIDevice *pci_dev, uint32_t address, int len)
+{
+if (pci_dev-cap.supported  address = pci_dev-cap.start 
+(address + len)  pci_dev-cap.start + pci_dev-cap.length)
+return 1;
+return 0;
+}
+
+uint32_t pci_default_cap_read_config(PCIDevice *pci_dev,
+ uint32_t address, int len)
+{
+return pci_read_config(pci_dev, address, len);
+}
+
+void pci_default_cap_write_config(PCIDevice *pci_dev,
+  uint32_t address, uint32_t val, int len)
+{
+pci_write_config(pci_dev, address, val, len);
+}
+
+uint32_t pci_default_read_config(PCIDevice *d,
+ uint32_t address, int len)
+{
+if (pci_access_cap_config(d, address, len))
+return d-cap.config_read(d, address, len);
+
+return pci_read_config(d, address, len);
+}
+
 void pci_default_write_config(PCIDevice *d,
   uint32_t address, uint32_t val, int len)
 {
@@ -485,6 +524,11 @@ void pci_default_write_config(PCIDevice *d,
 return;
 }
  default_config:
+if (pci_access_cap_config(d, address, len)) {
+d-cap.config_write(d, address, val, len);
+return;
+}
+
 /* not efficient, but simple */
 addr = address;
 for(i = 0; i  len; i++) {
@@ -905,3 +949,32 @@ PCIBus *pci_bridge_init(PCIBus *bus, int devfn, uint16_t 
vid, uint16_t did,
 s-bus = pci_register_secondary_bus(s-dev, map_irq);
 return s-bus;
 }
+
+int pci_enable_capability_support(PCIDevice *pci_dev,
+  uint32_t config_start,
+  PCICapConfigReadFunc *config_read,
+  PCICapConfigWriteFunc *config_write,
+  PCICapConfigInitFunc *config_init)
+{
+if (!pci_dev)
+return -ENODEV;
+
+if (config_start == 0)
+   pci_dev-cap.start = PCI_CAPABILITY_CONFIG_DEFAULT_START_ADDR;
+else if (config_start = 0x40  config_start  0xff)
+pci_dev-cap.start = config_start;
+else
+return -EINVAL;
+
+if (config_read)
+pci_dev-cap.config_read = config_read;
+else
+pci_dev-cap.config_read = pci_default_cap_read_config;
+if (config_write)
+pci_dev-cap.config_write = config_write;
+else
+pci_dev-cap.config_write = pci_default_cap_write_config;
+pci_dev-cap.supported = 1;
+pci_dev-config[PCI_CAPABILITY_LIST] = pci_dev-cap.start;
+return config_init(pci_dev);
+}
diff --git a/qemu/hw/pci.h b/qemu/hw/pci.h
index 2327215..127dbed 100644
--- a/qemu/hw/pci.h
+++ b/qemu/hw/pci.h
@@ -139,6 +139,12 @@ typedef void PCIMapIORegionFunc(PCIDevice *pci_dev, int 
region_num,
 uint32_t addr, uint32_t size, int type);
 typedef int PCIUnregisterFunc(PCIDevice *pci_dev);
 
+typedef void PCICapConfigWriteFunc(PCIDevice *pci_dev,
+   uint32_t address, uint32_t val, int len);
+typedef uint32_t PCICapConfigReadFunc(PCIDevice *pci_dev,
+  uint32_t address, int len);
+typedef int PCICapConfigInitFunc(PCIDevice *pci_dev);
+
 #define PCI_ADDRESS_SPACE_MEM  0x00
 #define PCI_ADDRESS_SPACE_IO   0x01
 #define PCI_ADDRESS_SPACE_MEM_PREFETCH 0x08
@@ -197,6 +203,10 @@ typedef struct PCIIORegion {
 
 #define PCI_COMMAND_RESERVED_MASK_HI (PCI_COMMAND_RESERVED >> 8)
 
+#define PCI_CAPABILITY_CONFIG_MAX_LENGTH 0x60
+#define PCI_CAPABILITY_CONFIG_DEFAULT_START_ADDR 0x40
+#define PCI_CAPABILITY_CONFIG_MSI_LENGTH 0x10
+
 struct PCIDevice {
 /* PCI config space */
 uint8_t config[256];
@@ -219,6 +229,14 @@ struct PCIDevice {
 
 /* Current IRQ levels.  Used internally by the generic PCI code.  */
 int irq_state[4];
+
+/* Device capability configuration space */
+struct {
+int supported;
+unsigned int 

[PATCH 05/16] Figure out device capability

2009-03-12 Thread Sheng Yang
Try to figure out the device's capabilities in update_dev_cap(). For now we
only care about the MSI capability.

The function pci_find_cap_offset was originally written by Allen for Xen.
Note that it needs root privileges to work, and that it depends on libpci.

Signed-off-by: Allen Kay allen.m@intel.com
Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 qemu/hw/device-assignment.c |   29 +
 qemu/hw/device-assignment.h |1 +
 2 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index e8a69ba..a354681 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -219,6 +219,35 @@ static void assigned_dev_ioport_map(PCIDevice *pci_dev, int region_num,
                           (r_dev->v_addrs + region_num));
 }
 
+static uint8_t pci_find_cap_offset(struct pci_dev *pci_dev, uint8_t cap)
+{
+    int id;
+    int max_cap = 48;
+    int pos = PCI_CAPABILITY_LIST;
+    int status;
+
+    status = pci_read_byte(pci_dev, PCI_STATUS);
+    if ((status & PCI_STATUS_CAP_LIST) == 0)
+        return 0;
+
+    while (max_cap--) {
+        pos = pci_read_byte(pci_dev, pos);
+        if (pos < 0x40)
+            break;
+
+        pos &= ~3;
+        id = pci_read_byte(pci_dev, pos + PCI_CAP_LIST_ID);
+
+        if (id == 0xff)
+            break;
+        if (id == cap)
+            return pos;
+
+        pos += PCI_CAP_LIST_NEXT;
+    }
+    return 0;
+}
+
 static void assigned_dev_pci_write_config(PCIDevice *d, uint32_t address,
   uint32_t val, int len)
 {
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
index da775d7..0fd78de 100644
--- a/qemu/hw/device-assignment.h
+++ b/qemu/hw/device-assignment.h
@@ -29,6 +29,7 @@
 #define __DEVICE_ASSIGNMENT_H__
 
 #include <sys/mman.h>
+#include <pci/pci.h>
 #include "qemu-common.h"
 #include "sys-queue.h"
 #include "pci.h"
-- 
1.5.4.5
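
For readers following the capability walk in pci_find_cap_offset(), here is a
minimal standalone sketch of the same traversal over an in-memory snapshot of
config space, so it can be tried without libpci or root privileges. The
function name find_cap_offset, the CFG_* macro names and the snapshot array
are assumptions made for this illustration; they are not part of the patch.
The offsets and bit values are the standard ones for PCI config space.

```c
#include <assert.h>
#include <stdint.h>

/* Standard PCI config-space offsets and bits; names are local to this sketch. */
#define CFG_STATUS          0x06
#define CFG_STATUS_CAP_LIST 0x10   /* bit 4 of the status register */
#define CFG_CAP_PTR         0x34   /* pointer to the first capability */
#define CAP_ID_MSI          0x05

/*
 * Hypothetical helper: walk the capability list in a 256-byte config-space
 * snapshot. Returns the offset of the capability with the given id, or 0
 * if the device has no such capability.
 */
static uint8_t find_cap_offset(const uint8_t cfg[256], uint8_t cap_id)
{
    int budget = 48;                 /* bounds the walk if the list loops */
    uint8_t pos;
    uint8_t id;

    assert(cfg != 0);
    if (!(cfg[CFG_STATUS] & CFG_STATUS_CAP_LIST))
        return 0;                    /* device advertises no capability list */

    pos = cfg[CFG_CAP_PTR];
    while (budget-- > 0) {
        pos &= (uint8_t)~3;          /* capability pointers are dword aligned */
        if (pos < 0x40)              /* capabilities live above the header */
            break;
        id = cfg[pos];               /* byte 0 of an entry: capability id */
        if (id == 0xff)
            break;
        if (id == cap_id)
            return pos;
        pos = cfg[pos + 1];          /* byte 1 of an entry: next pointer */
    }
    return 0;
}
```

Calling find_cap_offset(cfg, CAP_ID_MSI) on a snapshot whose list is
0x40 (power management) -> 0x50 (MSI) returns 0x50; a return value of 0 means
"not found", which is unambiguous because no capability can sit at offset 0.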

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: kvm-autotest -- introducing kvm_runtest_2

2009-03-12 Thread Ryan Harper
* Michael Goldish mgold...@redhat.com [2009-03-12 09:04]:
 
  
> > yep, used stepeditor to fix; definitely worth documenting where one
> > should be invoking stepeditor -- from the steps dir; if you don't run
> > it from there, it won't find the steps_data dir =(
> 
> Are you absolutely sure about that? That's not the way it's supposed
> to be. I tried running it on several machines and it worked every time
> regardless of where I invoked it from. Since it resides in the
> kvm_runtest_2 dir, I usually just change to that directory and type
> ./stepeditor.py. Then I use file->open and pick the steps file, and it
> works.

You're right; the problem was the stepfile that I opened, since the data dir
variable is derived from the name of the stepfile.

 
> If you have a very recent version, you should have a dir named
> steps_data under kvm_runtest_2, right next to steps. Inside
> steps_data you should have the data dirs. For steps/RHEL5.steps

I've got whatever is latest in the public repo.

> the corresponding data dir would be steps_data/RHEL5.steps_data/.
> If you have a slightly older version, you should have the data dirs
> inside the steps dir, next to the stepfiles themselves. For
> steps/RHEL5.steps, the corresponding data dir would be
> steps/RHEL5.steps_data/.
> 
> > I'll have to go back and re-read your email on where to put the
> > reference ppm files so one gets the reference comparison.
> 
> The paragraph above applies to the reference comparison as well.

OK, cool.

> > Right - I suppose it might be better if the names of the windows iso
> > disks matched how MS names them in MSDN, for example, kvm_runtest refers
> > to "Windows2008-x64.iso" which doesn't match any name from MSDN, what we
> > have is:
> > en_windows_server_2008_datacenter_enterprise_standard_x64_dvd_X14-26714.iso
> 
> This is a very good idea. I wonder how we can find out the MSDN names
> of the ISOs we have.   BTW, did the ISO you mentioned work with
> kvm_runtest?

MSDN lists the md5 and maybe sha1 hashes for the isos on the website
where they are downloaded.

That iso works until the step where it needs to set the password for the
user, and as we've discussed, without the original ppm files, I can't
figure out why it fails to match that screen.


-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ry...@us.ibm.com


[PATCH] KVM: Improvements for task switching

2009-03-12 Thread Bernhard Kohl
NSN's proprietary OS DMX sometimes does task switches.
To get it running in KVM the following changes were necessary:
- Inject interrupts only when the interrupt flag is set.
- Stop linking tss->prev_task_link to the task itself.
- Set the task link for CALL and GATE, as required.
- Do not call skip_emulated_instruction() for GATE.

Signed-off-by: Bernhard Kohl bernhard.k...@nsn.com
---
 arch/x86/kvm/vmx.c |3 ++-
 arch/x86/kvm/x86.c |   19 +--
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 5cf28df..eca57a3 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3357,7 +3357,8 @@ static void vmx_intr_assist(struct kvm_vcpu *vcpu)
 			enable_irq_window(vcpu);
 	}
 	if (vcpu->arch.interrupt.pending) {
-		vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr);
+		if (vcpu->arch.interrupt_window_open)
+			vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr);
 		if (kvm_cpu_has_interrupt(vcpu))
 			enable_irq_window(vcpu);
 	}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b556b6a..9052058 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3683,7 +3683,7 @@ static void save_state_to_tss32(struct kvm_vcpu *vcpu,
 	tss->fs = get_segment_selector(vcpu, VCPU_SREG_FS);
 	tss->gs = get_segment_selector(vcpu, VCPU_SREG_GS);
 	tss->ldt_selector = get_segment_selector(vcpu, VCPU_SREG_LDTR);
-	tss->prev_task_link = get_segment_selector(vcpu, VCPU_SREG_TR);
+	tss->prev_task_link = 0;
 }
 
 static int load_state_from_tss32(struct kvm_vcpu *vcpu,
@@ -3810,6 +3810,7 @@ out:
 
 static int kvm_task_switch_32(struct kvm_vcpu *vcpu, u16 tss_selector,
   u32 old_tss_base,
+  u16 old_tss_selector, int reason,
   struct desc_struct *nseg_desc)
 {
struct tss_segment_32 tss_segment_32;
@@ -3829,6 +3830,18 @@ static int kvm_task_switch_32(struct kvm_vcpu *vcpu, u16 tss_selector,
 			   &tss_segment_32, sizeof tss_segment_32))
 		goto out;
 
+	/*
+	 * SDM 3: table 6-2
+	 * Task linking required for CALL and GATE.
+	 */
+	if (reason == TASK_SWITCH_CALL || reason == TASK_SWITCH_GATE)
+	{
+		tss_segment_32.prev_task_link = old_tss_selector;
+		kvm_write_guest(vcpu->kvm, get_tss_base_addr(vcpu, nseg_desc),
+				&tss_segment_32, sizeof(struct tss_segment_32));
+
+	}
+
 	if (load_state_from_tss32(vcpu, &tss_segment_32))
 		goto out;
 
@@ -3882,10 +3895,12 @@ int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason)
 		kvm_x86_ops->set_rflags(vcpu, eflags & ~X86_EFLAGS_NT);
 	}
 
-	kvm_x86_ops->skip_emulated_instruction(vcpu);
+	if (reason != TASK_SWITCH_GATE)
+		kvm_x86_ops->skip_emulated_instruction(vcpu);
 
 	if (nseg_desc.type & 8)
 		ret = kvm_task_switch_32(vcpu, tss_selector, old_tss_base,
+					 old_tss_sel, reason,
 					 &nseg_desc);
 	else
 		ret = kvm_task_switch_16(vcpu, tss_selector, old_tss_base,
-- 
1.6.0.6
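
The task-linking rule the patch relies on (only CALL and gate-initiated
switches write the back link of the incoming task) can be sketched in
isolation as below. This is an illustration under stated assumptions, not the
kernel code: the enum switch_reason, struct tss_sketch and link_new_task() are
hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reasons for a task switch; not the kernel's definitions. */
enum switch_reason { SWITCH_IRET, SWITCH_JMP, SWITCH_CALL, SWITCH_GATE };

/* Only the field that matters for linking is modelled here. */
struct tss_sketch {
    uint16_t prev_task_link;   /* selector of the task to return to */
};

/*
 * Write the back link of the incoming task when the switch was caused by
 * a CALL or by a task gate. Returns 1 if the link was written, 0 if the
 * field was left untouched (JMP and IRET).
 */
static int link_new_task(struct tss_sketch *new_tss, uint16_t old_tr_selector,
                         enum switch_reason reason)
{
    assert(new_tss != 0);
    if (reason == SWITCH_CALL || reason == SWITCH_GATE) {
        new_tss->prev_task_link = old_tr_selector;
        return 1;
    }
    return 0;
}
```

With this shape, a JMP leaves whatever link the incoming TSS already held,
which is why the sketch reports whether it wrote the field instead of
clearing it.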




missing kvm smp tlb flush in invlpg

2009-03-12 Thread Andrea Arcangeli
From: Andrea Arcangeli aarca...@redhat.com

While looking at the invlpg out-of-sync code with Izik I think I noticed a
missing smp tlb flush here. Without this the other cpu can still write
to a freed host physical page. If rmap_remove is called, the smp tlb
flush must always happen before mmu_lock is released, because the VM
will take the mmu_lock before it can finally add the page to the
freelist after swapout. The mmu notifier makes it safe to flush the tlb
after freeing the page (otherwise it would never be safe), so we can do
a single flush for multiple sptes invalidated.

Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---

diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index a0c11ea..855eb71 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -445,6 +445,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
 	gpa_t pte_gpa = -1;
 	int level;
 	u64 *sptep;
+	int need_flush = 0;
 
 	spin_lock(&vcpu->kvm->mmu_lock);
 
@@ -464,6 +465,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
 				rmap_remove(vcpu->kvm, sptep);
 				if (is_large_pte(*sptep))
 					--vcpu->kvm->stat.lpages;
+				need_flush = 1;
 			}
 			set_shadow_pte(sptep, shadow_trap_nonpresent_pte);
 			break;
@@ -473,6 +475,8 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
 			break;
 	}
 
+	if (need_flush)
+		kvm_flush_remote_tlbs(vcpu->kvm);
 	spin_unlock(&vcpu->kvm->mmu_lock);
 
 	if (pte_gpa == -1)
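
The "remember that a flush is needed, then flush once before dropping the
lock" pattern used by this patch can be shown outside the kernel with a small
model. Everything here is a stand-in assumed for illustration: struct
mapping_table, flush_remote() and invalidate_range() are hypothetical, and the
lock is simulated with a flag so the ordering can be asserted.

```c
#include <assert.h>
#include <stddef.h>

#define NR_ENTRIES 8

/* Hypothetical stand-in for a table of shadow PTEs guarded by one lock. */
struct mapping_table {
    int locked;                 /* 1 while the (simulated) lock is held */
    int present[NR_ENTRIES];    /* 1 = entry currently maps a page */
    int remote_flushes;         /* how many times the costly flush ran */
};

static void flush_remote(struct mapping_table *t)
{
    assert(t->locked);          /* the flush must happen before unlock */
    t->remote_flushes++;
}

/* Drop all present entries in [first, last]; returns how many were dropped. */
static int invalidate_range(struct mapping_table *t, size_t first, size_t last)
{
    int need_flush = 0;
    int dropped = 0;
    size_t i;

    t->locked = 1;
    for (i = first; i <= last && i < NR_ENTRIES; i++) {
        if (t->present[i]) {
            t->present[i] = 0;
            need_flush = 1;     /* remember, but do not flush yet */
            dropped++;
        }
    }
    if (need_flush)
        flush_remote(t);        /* one flush covers every dropped entry */
    t->locked = 0;
    return dropped;
}
```

Dropping three entries costs one flush, and a call that finds nothing present
costs none, which mirrors the "single flush for multiple sptes" remark in the
commit message.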


[PATCH] kvm: external module: support building against Windriver 2.0 (kernel 2.6.21)

2009-03-12 Thread Bernhard Kohl
This is needed to compile kvm in a Windriver 2.0 distribution
(kernel 2.6.21). This kernel has an include file marker.h, but
trace_mark is not defined there. So the compat code in
kernel/include-compat/linux/marker.h is not included.

Signed-off-by: Bernhard Kohl bernhard.k...@nsn.com
---
 kernel/external-module-compat-comm.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/kernel/external-module-compat-comm.h b/kernel/external-module-compat-comm.h
index a14cea2..e40501e 100644
--- a/kernel/external-module-compat-comm.h
+++ b/kernel/external-module-compat-comm.h
@@ -25,6 +25,10 @@
 #  undef CONFIG_KVM_TRACE
 #endif
 
+#if LINUX_VERSION_CODE = KERNEL_VERSION(2,6,21)
+#define trace_mark(args...) ((void)0)
+#endif
+
 /*
  * 2.6.16 does not have GFP_NOWAIT
  */
-- 
1.6.0.6
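
As background for the version guard in this patch: KERNEL_VERSION packs
major, minor and patch level into one integer so that versions compare with
ordinary relational operators. The sketch below re-creates that encoding under
a different name so it builds in userspace. The comparison operator in the
archived patch line did not survive, so the "at or below 2.6.21" test shown
here is purely an assumption, and SKETCH_KERNEL_VERSION, at_or_below_2_6_21()
and trace_mark_stub are hypothetical names.

```c
#include <assert.h>

/* Same packing as the kernel macro of that era: one byte per component. */
#define SKETCH_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * A no-op stand-in for a tracing macro: it swallows any argument list and
 * expands to an expression statement with no effect. The patch uses the
 * GNU named-variadic form (args...); this is the C99 spelling.
 */
#define trace_mark_stub(...) ((void)0)

/* Hypothetical helper: 1 if the packed version is 2.6.21 or older. */
static int at_or_below_2_6_21(unsigned int code)
{
    return code <= SKETCH_KERNEL_VERSION(2, 6, 21);
}
```

Because each component gets its own byte, 2.6.21 packs to 0x020615 and any
2.6.x release with x above 21 compares greater, so one integer comparison is
enough for the guard.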




Re: [PATCH] kvm: external module: support building against Windriver 2.0 (kernel 2.6.21)

2009-03-12 Thread Jan Kiszka
Bernhard Kohl wrote:
> This is needed to compile kvm in a Windriver 2.0 distribution
> (kernel 2.6.21). This kernel has an include file marker.h, but
> trace_mark is not defined there. So the compat code in
> kernel/include-compat/linux/marker.h is not included.

I bet this is because Wind River patched some variant of LTTng into
their kernel.

However, I'm unsure if supporting significantly modified distribution
kernels is in the scope of this compat layer. If it is ok for the
maintainers, you should try to make the test more Wind River specific
(did you check that there is no side-effect for normal = 2.6.21
kernels?) and maybe add a comment.

Jan

 
> Signed-off-by: Bernhard Kohl bernhard.k...@nsn.com
> ---
>  kernel/external-module-compat-comm.h |4 
>  1 files changed, 4 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/external-module-compat-comm.h b/kernel/external-module-compat-comm.h
> index a14cea2..e40501e 100644
> --- a/kernel/external-module-compat-comm.h
> +++ b/kernel/external-module-compat-comm.h
> @@ -25,6 +25,10 @@
>  #  undef CONFIG_KVM_TRACE
>  #endif
>  
> +#if LINUX_VERSION_CODE = KERNEL_VERSION(2,6,21)
> +#define trace_mark(args...) ((void)0)
> +#endif
> +
>  /*
>   * 2.6.16 does not have GFP_NOWAIT
>   */

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


Re: [PATCH] KVM: Improvements for task switching

2009-03-12 Thread Jan Kiszka
Bernhard Kohl wrote:
> NSN's proprietary OS DMX sometimes does task switches.
> To get it running in KVM the following changes were necessary:
> Interrupt injection only with interrupt flag set.
> Linking the tss->prev_task_link to itself removed.
> Task linking is required for CALL and GATE.
> Do not call skip_emulated_instruction() for GATE.

Please post independent changes as separate patches. I guess the task
linking changes belong together, but surely not to the IRQ injection
patch. And the last change looks independent, too.

Another wish (specifically as this is tricky stuff): also describe in
the commit log, why you changed something.

 
> Signed-off-by: Bernhard Kohl bernhard.k...@nsn.com
> ---
>  arch/x86/kvm/vmx.c |3 ++-
>  arch/x86/kvm/x86.c |   19 +--
>  2 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 5cf28df..eca57a3 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -3357,7 +3357,8 @@ static void vmx_intr_assist(struct kvm_vcpu *vcpu)
>  			enable_irq_window(vcpu);
>  	}
>  	if (vcpu->arch.interrupt.pending) {
> -		vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr);
> +		if (vcpu->arch.interrupt_window_open)
> +			vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr);
>  		if (kvm_cpu_has_interrupt(vcpu))
>  			enable_irq_window(vcpu);
>  	}

That causes concerns on my side as we had a hard time stabilizing this
code. Need to think about it. Do you happen to have a test case for this
(if it's not publicly shareable, contact me directly)? Did you check
that this change causes no obvious regressions to other guests? What
about the user-inject IRQ case, does it already work for you as-is?

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b556b6a..9052058 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3683,7 +3683,7 @@ static void save_state_to_tss32(struct kvm_vcpu *vcpu,
>  	tss->fs = get_segment_selector(vcpu, VCPU_SREG_FS);
>  	tss->gs = get_segment_selector(vcpu, VCPU_SREG_GS);
>  	tss->ldt_selector = get_segment_selector(vcpu, VCPU_SREG_LDTR);
> -	tss->prev_task_link = get_segment_selector(vcpu, VCPU_SREG_TR);
> +	tss->prev_task_link = 0;
>  }
>  
>  static int load_state_from_tss32(struct kvm_vcpu *vcpu,
> @@ -3810,6 +3810,7 @@ out:
>  
>  static int kvm_task_switch_32(struct kvm_vcpu *vcpu, u16 tss_selector,
>  			      u32 old_tss_base,
> +			      u16 old_tss_selector, int reason,
>  			      struct desc_struct *nseg_desc)
>  {
>  	struct tss_segment_32 tss_segment_32;

What about 16-bit switches, are they already correct?

> @@ -3829,6 +3830,18 @@ static int kvm_task_switch_32(struct kvm_vcpu *vcpu, u16 tss_selector,
>  			   &tss_segment_32, sizeof tss_segment_32))
>  		goto out;
>  
> +	/*
> +	 * SDM 3: table 6-2
> +	 * Task linking required for CALL and GATE.
> +	 */
> +	if (reason == TASK_SWITCH_CALL || reason == TASK_SWITCH_GATE)
> +	{
> +		tss_segment_32.prev_task_link = old_tss_selector;
> +		kvm_write_guest(vcpu->kvm, get_tss_base_addr(vcpu, nseg_desc),
> +				&tss_segment_32, sizeof(struct tss_segment_32));
> +
> +	}
> +
>  	if (load_state_from_tss32(vcpu, &tss_segment_32))
>  		goto out;
>  
> @@ -3882,10 +3895,12 @@ int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason)
>  		kvm_x86_ops->set_rflags(vcpu, eflags & ~X86_EFLAGS_NT);
>  	}
>  
> -	kvm_x86_ops->skip_emulated_instruction(vcpu);
> +	if (reason != TASK_SWITCH_GATE)
> +		kvm_x86_ops->skip_emulated_instruction(vcpu);
>  
>  	if (nseg_desc.type & 8)
>  		ret = kvm_task_switch_32(vcpu, tss_selector, old_tss_base,
> +					 old_tss_sel, reason,
>  					 &nseg_desc);
>  	else
>  		ret = kvm_task_switch_16(vcpu, tss_selector, old_tss_base,

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


Re: [PATCH] KVM: Improvements for task switching

2009-03-12 Thread Jan Kiszka
Jan Kiszka wrote:
> Bernhard Kohl wrote:
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 5cf28df..eca57a3 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3357,7 +3357,8 @@ static void vmx_intr_assist(struct kvm_vcpu *vcpu)
>>  			enable_irq_window(vcpu);
>>  	}
>>  	if (vcpu->arch.interrupt.pending) {
>> -		vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr);
>> +		if (vcpu->arch.interrupt_window_open)
>> +			vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr);
>>  		if (kvm_cpu_has_interrupt(vcpu))
>>  			enable_irq_window(vcpu);
>>  	}
> 
> That causes concerns on my side as we had a hard time stabilizing this
> code. Need to think about it. Do you happen to have a test case for this
> (if it's not publicly shareable, contact me directly)? Did you check
> that this change causes no obvious regressions to other guests? What
> about the user-inject IRQ case, does it already work for you as-is?

Hmm, do_interrupt_requests will most likely not cause trouble as it
both pends and injects interrupts only when the window is open. I don't
get the scenario behind this here yet, but I think it would be a very
good chance to align the code layout of vmx_intr_assist with
do_interrupt_requests in this respect, either finally de-optimizing or
even breaking both :) - or bringing them into the same correct form.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


Unable to re-establish VNC connection after some time

2009-03-12 Thread Andreas Olsowski
Hello fellow kvm users/admins,
I'm currently running kvm-84 with linux-2.6.28.4 on x86_64.
My 2 linux guests are basically running the same kernel (minus iscsi,
multipath and kvm support).
Another guest is a win2008 server.

All of the machines experience the same problem:

After a while I can't connect via VNC; UltraVNC as well as RealVNC
experience timeouts waiting for the server to respond.

All machines show:
(qemu) info vnc
 VNC server active on: 0.0.0.0:1
 Client connected

It looks like something has died there ... could it be because I
didn't disconnect my VNC client properly?

Even (qemu) system_reset will not bring the screen back.

I am open to suggestions.

-- 
Regards,
 Andreas Olsowski  
mailto:andreas.olsow...@uni-lueneburg.de
System- und Netzwerktechnik
Sysadmin extraordinaire

Leuphana Universität Lüneburg
Scharnhorststraße 1
21335 Lüneburg

Tel: 04131 / 677-1309
Mobil: 0175 / 5720275



RE: x86: use smp_send_reschedule in kvm_vcpu_kick

2009-03-12 Thread Zhang, Xiantao
We also hacked the source along the lines of the patch, but the issue is not
caused by it. We are still trying to figure out the reason. Thanks!
Xiantao

-Original Message-
From: Gleb Natapov [mailto:g...@redhat.com] 
Sent: Thursday, March 12, 2009 7:04 PM
To: Zhang, Xiantao
Cc: Avi Kivity; Marcelo Tosatti; Ingo Molnar; kvm@vger.kernel.org; Peter 
Zijlstra
Subject: Re: x86: use smp_send_reschedule in kvm_vcpu_kick

On Thu, Mar 12, 2009 at 10:31:47AM +0800, Zhang, Xiantao wrote:
> Avi Kivity wrote:
> > Marcelo Tosatti wrote:
> >> OK, reworked patch:
> >> - change ia64 in addition to x86
> >> - add comment on smp send reschedule handlers about KVM's usage
> >> 
> >> Untested on IA64.
> >> 
> >> KVM: use smp_send_reschedule in kvm_vcpu_kick
> >> 
> >> KVM uses a function call IPI to cause the exit of a guest running on
> >> a physical cpu. For virtual interrupt notification there is no need
> >> to wait on IPI receival, or to execute any function.
> >> 
> >> This is exactly what the reschedule IPI does, without the overhead
> >> of function IPI. So use it instead of smp_call_function_single in
> >> kvm_vcpu_kick.
> >> 
> >> Also change the guest_mode variable to a bit in vcpu->requests, and
> >> use that to collapse multiple IPI's that would be issued between the
> >> first one and zeroing of guest mode.
> >> 
> >> This allows kvm_vcpu_kick to be called with interrupts disabled.
> >> 
> > 
> > Looks good. Will wait for Xiantao's test-n-ack before applying.
> 
> kvm-ia64 is broken due to recent check-ins about irq-bits, and we are trying
> to fix it. For this patch, ia64 has to export the symbol smp_send_reschedule
> before applying the patch.
Can you try this patch please: http://patchwork.kernel.org/patch/11103/

--
Gleb.
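
The idea in the quoted commit message of collapsing multiple kicks into a
single IPI can be illustrated with a small userspace model built on C11
atomics. This is only a sketch of the coalescing technique, assuming a
one-flag design: struct vcpu_sketch, kick() and ack_kick() are hypothetical
names, and the IPI is simulated by a counter.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical model of one vcpu's kick state. */
struct vcpu_sketch {
    atomic_flag kick_pending;   /* set while a kick is already in flight */
    atomic_int ipis_sent;       /* counts simulated reschedule IPIs */
};

/* Returns 1 if this call sent the (simulated) IPI, 0 if it was collapsed. */
static int kick(struct vcpu_sketch *v)
{
    if (atomic_flag_test_and_set(&v->kick_pending))
        return 0;               /* an earlier kick has not been handled yet */
    atomic_fetch_add(&v->ipis_sent, 1);
    return 1;
}

/* Called once the vcpu has left guest mode and noticed the kick. */
static void ack_kick(struct vcpu_sketch *v)
{
    atomic_flag_clear(&v->kick_pending);
}
```

Only the caller that flips the flag pays for the IPI; every later caller
returns immediately until the vcpu acknowledges, and nothing in kick() waits,
which is what makes such a path usable with interrupts disabled.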