[PATCH] KVM: make 'lapic_timer_ops' and 'kpit_ops' static
From: Hannes Eder han...@hanneseder.net

Fix these sparse warnings:

  arch/x86/kvm/lapic.c:916:22: warning: symbol 'lapic_timer_ops' was not declared. Should it be static?
  arch/x86/kvm/i8254.c:268:22: warning: symbol 'kpit_ops' was not declared. Should it be static?

Signed-off-by: Hannes Eder han...@hanneseder.net
Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 4e2e3f2..cf09bb6 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -265,7 +265,7 @@ static bool kpit_is_periodic(struct kvm_timer *ktimer)
     return ps->is_periodic;
 }

-struct kvm_timer_ops kpit_ops = {
+static struct kvm_timer_ops kpit_ops = {
     .is_periodic = kpit_is_periodic,
 };

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index dd934d2..4d76bb6 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -913,7 +913,7 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
     kvm_apic_local_deliver(apic, APIC_LVT0);
 }

-struct kvm_timer_ops lapic_timer_ops = {
+static struct kvm_timer_ops lapic_timer_ops = {
     .is_periodic = lapic_is_periodic,
 };

--
To unsubscribe from this list: send the line "unsubscribe kvm-commits" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issues with virtio_net in multiple guests?
Ken Robertson wrote:
> Hoping someone can help me track down an issue I'm experiencing on a KVM machine I built recently.
> ...
> SIOCSIFFLAGS: Cannot assign requested address
>
> The address isn't in use or anything, so no reason I can think of why it can't assign it. It recognizes the device, however can't bring it up. All the VMs have unique MAC addresses, randomly generated. One of the ones that doesn't work is using 93:01:dc:a0:f0:57.

That MAC address is not valid. The LSB of the first byte should be 0 to indicate unicast, and the second LSB of the first byte should be 1 to indicate a locally-assigned address. e.g. 92:01:dc:a0:f0:57 should work.

-jim
--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Issues with virtio_net in multiple guests?
Jim,

That was it! I didn't realize there was some significance of certain bits within the address. Changing that first byte resolved the issue.

Should I be setting the 2nd bit in the LSB to 1? I started logging into all the systems I have access to and realized all of them have 00 as the first byte in the address. Should I just stick to 00 on all mine? Or by making that 2nd bit 1, does that force the card to inherit an address from libvirt or somewhere else instead of the VM configuration?

I'll play around with it some more, but at least that mystery is solved. BTW, was the first time posting on this list and love it! Quickest response I've ever gotten on a mailing list. :)

Thanks!
Ken

On Wed, Mar 11, 2009 at 11:28 PM, Jim Paris j...@jtan.com wrote:
> Ken Robertson wrote:
>> Hoping someone can help me track down an issue I'm experiencing on a KVM machine I built recently.
>> ...
>> SIOCSIFFLAGS: Cannot assign requested address
>>
>> The address isn't in use or anything, so no reason I can think of why it can't assign it. It recognizes the device, however can't bring it up. All the VMs have unique MAC addresses, randomly generated. One of the ones that doesn't work is using 93:01:dc:a0:f0:57.
>
> That MAC address is not valid. The LSB of the first byte should be 0 to indicate unicast, and the second LSB of the first byte should be 1 to indicate a locally-assigned address. e.g. 92:01:dc:a0:f0:57 should work.
>
> -jim
Re: Issues with virtio_net in multiple guests?
Ken Robertson wrote:
> Jim,
>
> That was it! I didn't realize there was some significance of certain bits within the address. Changing that first byte resolved the issue.
>
> Should I be setting the 2nd bit in the LSB to 1? I started logging into all the systems I have access to and realized all of them have 00 as the first byte in the address. Should I just stick to 00 on all mine? Or by making that 2nd bit 1, does that force the card to inherit an address from libvirt or somewhere else instead of the VM configuration?
>
> I'll play around with it some more, but at least that mystery is solved.

The 2nd LSB of the first byte just says it's a locally generated address rather than one of the officially-assigned OUIs. The 1st LSB of the first byte is the important one as that's what the Linux kernel checks and rejects if it's set (it also rejects an address of all zeros).

-jim
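The rule Jim describes can be sketched in C as follows. This is only an illustration of the two bits discussed in the thread plus the all-zeros case; the helper names are hypothetical and this is not the kernel's own code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 0 of the first octet: 0 = unicast, 1 = multicast/group address. */
static bool mac_is_multicast(const uint8_t mac[6])
{
    assert(mac != NULL);
    return (mac[0] & 0x01) != 0;
}

/* Bit 1 of the first octet: 1 = locally administered, 0 = OUI-assigned. */
static bool mac_is_local(const uint8_t mac[6])
{
    assert(mac != NULL);
    return (mac[0] & 0x02) != 0;
}

/* True when all six octets are zero. */
static bool mac_is_zero(const uint8_t mac[6])
{
    size_t i;

    assert(mac != NULL);
    for (i = 0; i < 6; i++) {
        if (mac[i] != 0)
            return false;
    }
    return true;
}

/* Usable as an interface address: unicast and not all zeros. */
static bool mac_is_usable(const uint8_t mac[6])
{
    return !mac_is_multicast(mac) && !mac_is_zero(mac);
}
```

For the addresses in this thread, 93:01:dc:a0:f0:57 fails the check because bit 0 of 0x93 is set, while 92:01:dc:a0:f0:57 passes and also has the locally-administered bit set.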
Re: kvm-autotest -- introducing kvm_runtest_2
* Michael Goldish mgold...@redhat.com [2009-03-12 02:26]:
> Regarding the stepfiles you created for Linux -- I can't help much with those since I don't have the data. I do believe that if I had the data and the stepfiles I could quickly identify the problem, so if you think those can be sent to us, I'd like to have them.
>
>> I created a stepfile for RHEL5 and what I'm seeing is that one of the screens I captured in stepmaker ended up having a focus ring around something and on replay the focus isn't there. This situation isn't something that a new algo will fix as you pointed out. I'm wondering if this is something you've seen. I don't quite understand how it would happen since stepmaker and the replay send the same keystrokes. I also don't see how in general this can be avoided.
>
> The problem sounds familiar. Does the ring appear around one of the GNOME menubars, i.e. around Applications or System? GNOME seems to be somewhat nondeterministic with those rings. If you run the stepfile several times, you'll notice that in most cases you'll see a focus ring (or no focus ring, I don't quite remember) and the rest of the time you'll get the other case.

Ding Ding Ding! =)

> This can be avoided either with experience, or a good wiki entry on picking the right barriers (which we plan to create). But you don't have to avoid making mistakes with stepmaker -- most types of mistakes are fixed very quickly and easily with stepeditor.

yep, used stepeditor to fix; definitely worth documenting where one should be invoking stepeditor -- from the steps dir; if you don't run it from there, it won't find the steps_data dir =(

> The fix depends on exactly what you were trying to do:
> - If you sent alt-f1 to open the menu, and in the following step picked the open menu (including the Applications caption itself) to make sure it was open -- use stepeditor to modify the barrier so that it doesn't include the Applications caption or anything that might have a ring around it.

That worked for me.

> The following text was copied from your previous e-mail:
>
>> I do have the debug dir data from these runs. Looking at the cropped ppm and screendump ppm is how I determined that there must be something wrong with how the image is rendered since the cropped ppm matches the screendump output, but with whatever subtle difference that generates a different md5sum.
>
> I'm not sure my previous e-mail was clear enough, so just in case it wasn't, let me rephrase: The cropped ppm is generated from the screendump ppm every time the stepfile running module receives a screendump from the guest in order to see if it matches a barrier. This is done for debugging purposes. If you somehow check, you'll see there is no subtle difference between those two files. It wouldn't make sense to find a subtle difference between them, and if you did find one, it certainly wouldn't indicate a stepfile problem, but rather a very strange bug in the framework. You should be looking for subtle differences between the screendump ppm and the reference screendump ppm, as well as between the cropped screendump ppm and the reference cropped screendump ppm. By reference I mean coming from the stepmaker data. If you don't have the stepmaker data, you have no way of knowing what caused the difference in the md5sums.

Right -- the real win was comparing the full screendump to the reference screendump - basically, without the reference dumps, the debug output isn't useful. I'll have to go back and re-read your email on where to put the reference ppm files so one gets the reference comparison.

> There are two other things I forgot to mention in my previous e-mail: The Windows failures you're seeing might be caused by KVM bugs other than the one I mentioned. KVM-84 has a very strong tendency to crash during Windows installations. You can use the logs to find out if that happened in your case. If you have the latest git HEAD the exception info will look something like "Barrier timed out at step ... (VM is dead)", and if you have a slightly older version, you'll probably see "(guest is stuck)" at the end of the info string. You should also see the system consistently complaining that it can't fetch any screendumps from the guest (this will appear in stdout).

I've seen those on kvm-84.

> The other thing has to do with the ISO files. kvm_runtest has a very important feature that we innocently forgot to implement in kvm_runtest_2 -- md5sum verification of the ISO files. This means that the framework currently makes no use of the md5sum and md5sum_1m parameters in the config file. This means you might be using different ISOs than the ones we made our stepfiles with. In that case I wouldn't expect any stepfile to succeed. However, if you used these same ISOs with kvm_runtest then they should be fine. In any case, I'll add the feature ASAP to the git repository.

Right - I
Re: Problem with KVM-84 and more than 4 processors
Matthias Hovestadt wrote:
> Hi!
>
>> I am unable to reproduce - 'modprobe kvm' gets me the expected lsmod line. Can you reproduce with plain 2.6.27 instead of the gentoo build?
>
> No, with a vanilla kernel from kernel.org it seems to work fine.

Please take it up with the gentoo kernel team then.

--
error compiling committee.c: too many arguments to function
[PATCH] compile fix - avoid raw string literal
This patch fixes compilation problems of kvm-userspace on current gcc 4.4 compilers, which implement the following standard: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2442.htm

Signed-off-by: Jochen Roth jr...@linux.vnet.ibm.com

diff --git a/user/test/x86/apic.c b/user/test/x86/apic.c
index 9c6205b..7794615 100644
--- a/user/test/x86/apic.c
+++ b/user/test/x86/apic.c
@@ -54,14 +54,14 @@ asm (
     "push %r9 \n\t"
     "push %r8 \n\t"
 #endif
-    "push %"R"di \n\t"
-    "push %"R"si \n\t"
-    "push %"R"bp \n\t"
-    "push %"R"sp \n\t"
-    "push %"R"bx \n\t"
-    "push %"R"dx \n\t"
-    "push %"R"cx \n\t"
-    "push %"R"ax \n\t"
+    "push %"R "di \n\t"
+    "push %"R "si \n\t"
+    "push %"R "bp \n\t"
+    "push %"R "sp \n\t"
+    "push %"R "bx \n\t"
+    "push %"R "dx \n\t"
+    "push %"R "cx \n\t"
+    "push %"R "ax \n\t"
 #ifdef __x86_64__
     "mov %rsp, %rdi \n\t"
     "callq *8*16(%rsp) \n\t"
@@ -70,14 +70,14 @@ asm (
     "calll *4+4*8(%esp) \n\t"
     "add $4, %esp \n\t"
 #endif
-    "pop %"R"ax \n\t"
-    "pop %"R"cx \n\t"
-    "pop %"R"dx \n\t"
-    "pop %"R"bx \n\t"
-    "pop %"R"bp \n\t"
-    "pop %"R"bp \n\t"
-    "pop %"R"si \n\t"
-    "pop %"R"di \n\t"
+    "pop %"R "ax \n\t"
+    "pop %"R "cx \n\t"
+    "pop %"R "dx \n\t"
+    "pop %"R "bx \n\t"
+    "pop %"R "bp \n\t"
+    "pop %"R "bp \n\t"
+    "pop %"R "si \n\t"
+    "pop %"R "di \n\t"
 #ifdef __x86_64__
     "pop %r8 \n\t"
     "pop %r9 \n\t"
diff --git a/user/test/x86/vmexit.c b/user/test/x86/vmexit.c
index bd57bfa..981d6c1 100644
--- a/user/test/x86/vmexit.c
+++ b/user/test/x86/vmexit.c
@@ -31,7 +31,7 @@ int main()
     t1 = rdtsc();
     for (i = 0; i < N; ++i)
-        asm volatile ("push %%"R"bx; cpuid; pop %%"R"bx"
+        asm volatile ("push %%"R "bx; cpuid; pop %%"R "bx"
                       : : : "eax", "ecx", "edx");
     t2 = rdtsc();
     printf("vmexit latency: %d\n", (int)((t2 - t1) / N));
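For readers unfamiliar with the problem: the referenced proposal adds raw string literals, which begin with an R immediately followed by a double quote, so a macro named R placed directly before a string literal is no longer seen as a macro. The sketch below shows the pattern the patch relies on. It assumes that R is a macro expanding to the register-name prefix ("r" on x86_64, "e" on 32-bit x86), which is not shown in the patch itself; the function name is hypothetical. The sketch puts whitespace on both sides of R, which is harmless in C.

```c
#include <assert.h>
#include <string.h>

/* Assumption: R selects the register-width prefix for the target. */
#ifdef __x86_64__
#define R "r"
#else
#define R "e"
#endif

/*
 * Adjacent string literals are concatenated after macro expansion.
 * The whitespace between R and the following quote keeps R an ordinary
 * identifier instead of the start of a raw string literal.
 */
static const char push_di[] = "push %" R "di \n\t";

/* Return the assembled instruction text. */
static const char *push_di_insn(void)
{
    assert(sizeof(push_di) > 1);
    return push_di;
}
```

On an x86_64 build this yields the text `push %rdi`; on a 32-bit build it yields `push %edi`.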
[PATCH 02/16] kvm: deassign irq for INTx
From: Marcelo Tosatti mtosa...@redhat.com

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 qemu/hw/device-assignment.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 7c73210..19848b4 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -536,6 +536,14 @@ static int assign_irq(AssignedDevInfo *adev)
         calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
     assigned_irq_data.guest_irq = irq;
     assigned_irq_data.host_irq = dev->real_device.irq;
+#ifdef KVM_CAP_ASSIGN_DEV_IRQ
+    assigned_irq_data.flags = KVM_DEV_IRQ_HOST_INTX | KVM_DEV_IRQ_GUEST_INTX;
+    r = kvm_deassign_irq(kvm_context, &assigned_irq_data);
+    /* -ENXIO means no assigned irq */
+    if (r && r != -ENXIO)
+        perror("assign_irq: deassign");
+#endif
+
     r = kvm_assign_irq(kvm_context, &assigned_irq_data);
     if (r < 0) {
         fprintf(stderr, "Failed to assign irq for \"%s\": %s\n",
--
1.5.4.5
[PATCH 10/16] kvm: Support MSI convert to INTx in device assignment
Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 qemu/hw/device-assignment.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index bda0e95..01485d7 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -588,7 +588,11 @@ static int assign_irq(AssignedDevInfo *adev)
     assigned_irq_data.guest_irq = irq;
     assigned_irq_data.host_irq = dev->real_device.irq;
 #ifdef KVM_CAP_ASSIGN_DEV_IRQ
-    assigned_irq_data.flags = KVM_DEV_IRQ_HOST_INTX | KVM_DEV_IRQ_GUEST_INTX;
+    if (dev->cap.available & ASSIGNED_DEVICE_CAP_MSI)
+        assigned_irq_data.flags = KVM_DEV_IRQ_HOST_MSI | KVM_DEV_IRQ_GUEST_INTX;
+    else
+        assigned_irq_data.flags = KVM_DEV_IRQ_HOST_INTX | KVM_DEV_IRQ_GUEST_INTX;
+
     r = kvm_deassign_irq(kvm_context, &assigned_irq_data);
     /* -ENXIO means no assigned irq */
     if (r && r != -ENXIO)
--
1.5.4.5
[PATCH 11/16] Add MSI-X related macro to pci.c
Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 qemu/hw/pci.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/qemu/hw/pci.h b/qemu/hw/pci.h
index 127dbed..1392626 100644
--- a/qemu/hw/pci.h
+++ b/qemu/hw/pci.h
@@ -206,6 +206,7 @@ typedef struct PCIIORegion {
 #define PCI_CAPABILITY_CONFIG_MAX_LENGTH 0x60
 #define PCI_CAPABILITY_CONFIG_DEFAULT_START_ADDR 0x40
 #define PCI_CAPABILITY_CONFIG_MSI_LENGTH 0x10
+#define PCI_CAPABILITY_CONFIG_MSIX_LENGTH 0x10

 struct PCIDevice {
     /* PCI config space */
--
1.5.4.5
[PATCH 0/16 v5] Device assignment improvement in userspace
Patches 1 and 2 are new; all the others have been sent before.

This (huge) patchset contains:

Patches 1..2 are the new interface after the reworked device assignment kernel part.

Patches 3..6 are a generic capability support mechanism. These could be adopted by QEmu upstream as well.

Patches 7..10 enable MSI with device assignment on KVM. Also, because the reworked device assignment kernel part dropped the MSI-to-INTx conversion mechanism, patch 10 enables it again in userspace.

Patches 11..13 enable MSI-X with device assignment on KVM.

Patches 14..16 enable SR-IOV with KVM.

Updates since the latest series:
1. Convert to the new ioctl interface.
2. Merge the capability configuration space with the PCIDevice one.
3. Better support for deassigning an IRQ (unloading the driver) with MSI/MSI-X.
4. No longer assume IRQ0 means no INTx; check the interrupt pin field in configuration space instead.

Please help to review! Thanks!
--
 libkvm/kvm-common.h         |    1 +
 libkvm/libkvm.c             |  176 +--
 libkvm/libkvm.h             |   58 +-
 qemu/Makefile.target        |    1 +
 qemu/configure              |   20 ++
 qemu/hw/device-assignment.c |  526 +--
 qemu/hw/device-assignment.h |   17 ++
 qemu/hw/pci.c               |   77 ++-
 qemu/hw/pci.h               |   38 +++
 9 files changed, 871 insertions(+), 43 deletions(-)
[PATCH 07/16] kvm: user interface for MSI type irq routing
Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 libkvm/libkvm.c |   98 ---
 libkvm/libkvm.h |   22
 2 files changed, 101 insertions(+), 19 deletions(-)

diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
index 80a0481..e9bae23 100644
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -1265,11 +1265,12 @@ int kvm_clear_gsi_routes(kvm_context_t kvm)
 #endif
 }

-int kvm_add_irq_route(kvm_context_t kvm, int gsi, int irqchip, int pin)
+int kvm_add_routing_entry(kvm_context_t kvm,
+                          struct kvm_irq_routing_entry* entry)
 {
 #ifdef KVM_CAP_IRQ_ROUTING
     struct kvm_irq_routing *z;
-    struct kvm_irq_routing_entry *e;
+    struct kvm_irq_routing_entry *new;
     int n, size;

     if (kvm->irq_routes->nr == kvm->nr_allocated_irq_routes) {
@@ -1277,7 +1278,7 @@ int kvm_add_irq_route(kvm_context_t kvm, int gsi, int irqchip, int pin)
         if (n < 64)
             n = 64;
         size = sizeof(struct kvm_irq_routing);
-        size += n * sizeof(*e);
+        size += n * sizeof(*new);
         z = realloc(kvm->irq_routes, size);
         if (!z)
             return -ENOMEM;
@@ -1285,34 +1286,77 @@ int kvm_add_irq_route(kvm_context_t kvm, int gsi, int irqchip, int pin)
         kvm->irq_routes = z;
     }
     n = kvm->irq_routes->nr++;
-    e = &kvm->irq_routes->entries[n];
-    memset(e, 0, sizeof(*e));
-    e->gsi = gsi;
-    e->type = KVM_IRQ_ROUTING_IRQCHIP;
-    e->flags = 0;
-    e->u.irqchip.irqchip = irqchip;
-    e->u.irqchip.pin = pin;
+    new = &kvm->irq_routes->entries[n];
+    memset(new, 0, sizeof(*new));
+    new->gsi = entry->gsi;
+    new->type = entry->type;
+    new->flags = entry->flags;
+    new->u = entry->u;
     return 0;
 #else
     return -ENOSYS;
 #endif
 }

-int kvm_del_irq_route(kvm_context_t kvm, int gsi, int irqchip, int pin)
+int kvm_add_irq_route(kvm_context_t kvm, int gsi, int irqchip, int pin)
+{
+#ifdef KVM_CAP_IRQ_ROUTING
+    struct kvm_irq_routing_entry e;
+
+    e.gsi = gsi;
+    e.type = KVM_IRQ_ROUTING_IRQCHIP;
+    e.flags = 0;
+    e.u.irqchip.irqchip = irqchip;
+    e.u.irqchip.pin = pin;
+    return kvm_add_routing_entry(kvm, &e);
+#else
+    return -ENOSYS;
+#endif
+}
+
+int kvm_del_routing_entry(kvm_context_t kvm,
+                          struct kvm_irq_routing_entry* entry)
 {
 #ifdef KVM_CAP_IRQ_ROUTING
     struct kvm_irq_routing_entry *e, *p;
-    int i;
+    int i, found = 0;

     for (i = 0; i < kvm->irq_routes->nr; ++i) {
         e = &kvm->irq_routes->entries[i];
-        if (e->type == KVM_IRQ_ROUTING_IRQCHIP
-            && e->gsi == gsi
-            && e->u.irqchip.irqchip == irqchip
-            && e->u.irqchip.pin == pin) {
-            p = &kvm->irq_routes->entries[--kvm->irq_routes->nr];
-            *e = *p;
-            return 0;
+        if (e->type == entry->type
+            && e->gsi == entry->gsi) {
+            switch (e->type)
+            {
+            case KVM_IRQ_ROUTING_IRQCHIP: {
+                if (e->u.irqchip.irqchip ==
+                    entry->u.irqchip.irqchip
+                    && e->u.irqchip.pin ==
+                    entry->u.irqchip.pin) {
+                    p = &kvm->irq_routes->
+                        entries[--kvm->irq_routes->nr];
+                    *e = *p;
+                    found = 1;
+                }
+                break;
+            }
+            case KVM_IRQ_ROUTING_MSI: {
+                if (e->u.msi.address_lo ==
+                    entry->u.msi.address_lo
+                    && e->u.msi.address_hi ==
+                    entry->u.msi.address_hi
+                    && e->u.msi.data == entry->u.msi.data) {
+                    p = &kvm->irq_routes->
+                        entries[--kvm->irq_routes->nr];
+                    *e = *p;
+                    found = 1;
+                }
+                break;
+            }
+            default:
+                break;
+            }
+            if (found)
+                return 0;
         }
     }
     return -ESRCH;
@@ -1321,6 +1365,22 @@ int kvm_del_irq_route(kvm_context_t kvm, int gsi, int irqchip, int pin)
 #endif
 }

+int kvm_del_irq_route(kvm_context_t kvm, int gsi, int irqchip, int pin)
+{
+#ifdef KVM_CAP_IRQ_ROUTING
+    struct kvm_irq_routing_entry e;
+
+    e.gsi = gsi;
+
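The deletion path in this patch removes a routing entry by copying the last entry over the matching one and shrinking the count, which avoids shifting the rest of the array but does not preserve order. A minimal sketch of that idiom, using a simplified and hypothetical entry type and a fixed-size table rather than the libkvm structures, and returning -1 instead of -ESRCH:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the routing entry and table. */
struct route {
    int gsi;
    int type;
};

struct route_table {
    size_t nr;                 /* number of valid entries */
    struct route entries[8];
};

/*
 * Remove the first entry matching gsi and type by overwriting it with
 * the last entry and decrementing the count. Order is not preserved.
 * Returns 0 on success, -1 if no entry matched.
 */
static int route_table_del(struct route_table *t, int gsi, int type)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < t->nr; ++i) {
        if (t->entries[i].gsi == gsi && t->entries[i].type == type) {
            t->entries[i] = t->entries[--t->nr];
            return 0;
        }
    }
    return -1;
}
```

When the matching entry is already the last one, the assignment copies the entry onto itself and only the count changes.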
[PATCH 03/16] kvm: Replace force type convert with container_of()
Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 qemu/hw/device-assignment.c |   20 ++++++++++++--------
 1 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 19848b4..e8a69ba 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -144,7 +144,7 @@ static uint32_t assigned_dev_ioport_readl(void *opaque, uint32_t addr)
 static void assigned_dev_iomem_map(PCIDevice *pci_dev, int region_num,
                                    uint32_t e_phys, uint32_t e_size, int type)
 {
-    AssignedDevice *r_dev = (AssignedDevice *) pci_dev;
+    AssignedDevice *r_dev = container_of(pci_dev, AssignedDevice, dev);
     AssignedDevRegion *region = &r_dev->v_addrs[region_num];
     uint32_t old_ephys = region->e_physbase;
     uint32_t old_esize = region->e_size;
@@ -175,7 +175,7 @@ static void assigned_dev_iomem_map(PCIDevice *pci_dev, int region_num,
 static void assigned_dev_ioport_map(PCIDevice *pci_dev, int region_num,
                                     uint32_t addr, uint32_t size, int type)
 {
-    AssignedDevice *r_dev = (AssignedDevice *) pci_dev;
+    AssignedDevice *r_dev = container_of(pci_dev, AssignedDevice, dev);
     AssignedDevRegion *region = &r_dev->v_addrs[region_num];
     int first_map = (region->e_size == 0);
     CPUState *env;
@@ -224,6 +224,7 @@ static void assigned_dev_pci_write_config(PCIDevice *d, uint32_t address,
 {
     int fd;
     ssize_t ret;
+    AssignedDevice *pci_dev = container_of(d, AssignedDevice, dev);

     DEBUG("(%x.%x): address=%04x val=0x%08x len=%d\n",
           ((d->devfn >> 3) & 0x1F), (d->devfn & 0x7),
@@ -245,7 +246,7 @@ static void assigned_dev_pci_write_config(PCIDevice *d, uint32_t address,
           ((d->devfn >> 3) & 0x1F), (d->devfn & 0x7),
           (uint16_t) address, val, len);

-    fd = ((AssignedDevice *)d)->real_device.config_fd;
+    fd = pci_dev->real_device.config_fd;

 again:
     ret = pwrite(fd, &val, len, address);
@@ -266,6 +267,7 @@ static uint32_t assigned_dev_pci_read_config(PCIDevice *d, uint32_t address,
     uint32_t val = 0;
     int fd;
     ssize_t ret;
+    AssignedDevice *pci_dev = container_of(d, AssignedDevice, dev);

     if ((address >= 0x10 && address <= 0x24) ||
         address == 0x34 || address == 0x3c || address == 0x3d) {
@@ -279,7 +281,7 @@ static uint32_t assigned_dev_pci_read_config(PCIDevice *d, uint32_t address,
     if (address == 0xFC)
         goto do_log;

-    fd = ((AssignedDevice *)d)->real_device.config_fd;
+    fd = pci_dev->real_device.config_fd;

 again:
     ret = pread(fd, &val, len, address);
@@ -618,15 +620,17 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus)
 {
     int r;
     AssignedDevice *dev;
+    PCIDevice *pci_dev;
     uint8_t e_device, e_intx;

     DEBUG("Registering real physical device %s (bus=%x dev=%x func=%x)\n",
           adev->name, adev->bus, adev->dev, adev->func);

-    dev = (AssignedDevice *)
-        pci_register_device(bus, adev->name, sizeof(AssignedDevice),
-                            -1, assigned_dev_pci_read_config,
-                            assigned_dev_pci_write_config);
+    pci_dev = pci_register_device(bus, adev->name,
+              sizeof(AssignedDevice), -1, assigned_dev_pci_read_config,
+              assigned_dev_pci_write_config);
+    dev = container_of(pci_dev, AssignedDevice, dev);
+
     if (NULL == dev) {
         fprintf(stderr, "%s: Error: Couldn't register real device %s\n",
                 __func__, adev->name);
--
1.5.4.5
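As a rough illustration of why container_of() is preferable to a plain cast: the cast is only correct while the embedded member sits at offset 0 of the containing struct, whereas container_of() subtracts the member's real offset. The sketch below uses hypothetical struct names and a simple offsetof-based macro, not the definition from the QEMU tree.

```c
#include <assert.h>
#include <stddef.h>

/* A simple offsetof-based container_of; an assumption, not QEMU's macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for PCIDevice and AssignedDevice. */
struct base_dev {
    int devfn;
};

struct wrapped_dev {
    int private_state;
    struct base_dev dev;   /* deliberately not the first member */
};

/*
 * Convert a pointer to the embedded member back to its container.
 * The pointer is checked before the conversion: applying container_of()
 * to NULL only yields NULL when the member is at offset 0.
 */
static struct wrapped_dev *to_wrapped(struct base_dev *d)
{
    assert(d != NULL);
    return container_of(d, struct wrapped_dev, dev);
}
```

The same consideration applies to the last hunk of the patch: testing the result of container_of() against NULL is equivalent to testing pci_dev only as long as dev stays the first member of AssignedDevice.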
[PATCH 12/16] kvm: add ioctl KVM_SET_MSIX_ENTRY_NR and KVM_SET_MSIX_ENTRY
Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 libkvm/libkvm.c |   25 +++++++++++++++++++++++++
 libkvm/libkvm.h |    7 +++++++
 2 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
index 405b0bf..f8129a4 100644
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -1410,3 +1410,28 @@ int kvm_get_irq_route_gsi(kvm_context_t kvm)
     return KVM_IOAPIC_NUM_PINS;
 }

+#ifdef KVM_CAP_DEVICE_MSIX
+int kvm_assign_set_msix_nr(kvm_context_t kvm,
+                           struct kvm_assigned_msix_nr *msix_nr)
+{
+    int ret;
+
+    ret = ioctl(kvm->vm_fd, KVM_ASSIGN_SET_MSIX_NR, msix_nr);
+    if (ret < 0)
+        return -errno;
+
+    return ret;
+}
+
+int kvm_assign_set_msix_entry(kvm_context_t kvm,
+                              struct kvm_assigned_msix_entry *entry)
+{
+    int ret;
+
+    ret = ioctl(kvm->vm_fd, KVM_ASSIGN_SET_MSIX_ENTRY, entry);
+    if (ret < 0)
+        return -errno;
+
+    return ret;
+}
+#endif
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
index 9a7cbc6..d3e431a 100644
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -854,4 +854,11 @@ int kvm_commit_irq_routes(kvm_context_t kvm);
  * \param kvm Pointer to the current kvm_context
  */
 int kvm_get_irq_route_gsi(kvm_context_t kvm);
+
+#ifdef KVM_CAP_DEVICE_MSIX
+int kvm_assign_set_msix_nr(kvm_context_t kvm,
+                           struct kvm_assigned_msix_nr *msix_nr);
+int kvm_assign_set_msix_entry(kvm_context_t kvm,
+                              struct kvm_assigned_msix_entry *entry);
+#endif
 #endif
--
1.5.4.5
[PATCH 15/16] KVM: Fill config with correct VID/DID
An SR-IOV virtual function does not show the correct Vendor ID/Device ID in config space, so we have to fill them in manually from the vendor/device files in sysfs.

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 qemu/hw/device-assignment.c |   31 ++++++++++++++++++++++++++++++-
 1 files changed, 30 insertions(+), 1 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 69f8e3a..ea67ce9 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -317,7 +317,8 @@ static uint32_t assigned_dev_pci_read_config(PCIDevice *d, uint32_t address,
     ssize_t ret;
     AssignedDevice *pci_dev = container_of(d, AssignedDevice, dev);

-    if ((address >= 0x10 && address <= 0x24) || address == 0x34 ||
+    if (address < 0x4 ||
+        (address >= 0x10 && address <= 0x24) || address == 0x34 ||
         address == 0x3c || address == 0x3d ||
         pci_access_cap_config(d, address, len)) {
         val = pci_default_read_config(d, address, len);
@@ -429,6 +430,7 @@ static int get_real_device(AssignedDevice *pci_dev, uint8_t r_bus,
     int fd, r = 0;
     FILE *f;
     unsigned long long start, end, size, flags;
+    unsigned long id;
     PCIRegion *rp;
     PCIDevRegions *dev = &pci_dev->real_device;

@@ -488,6 +490,33 @@ again:
         DEBUG("region %d size %d start 0x%llx type %d resource_fd %d\n",
               r, rp->size, start, rp->type, rp->resource_fd);
     }
+
+    fclose(f);
+
+    /* read and fill device ID */
+    snprintf(name, sizeof(name), "%svendor", dir);
+    f = fopen(name, "r");
+    if (f == NULL) {
+        fprintf(stderr, "%s: %s: %m\n", __func__, name);
+        return 1;
+    }
+    if (fscanf(f, "%li\n", &id) == 1) {
+        pci_dev->dev.config[0] = id & 0xff;
+        pci_dev->dev.config[1] = (id & 0xff00) >> 8;
+    }
+    fclose(f);
+
+    /* read and fill vendor ID */
+    snprintf(name, sizeof(name), "%sdevice", dir);
+    f = fopen(name, "r");
+    if (f == NULL) {
+        fprintf(stderr, "%s: %s: %m\n", __func__, name);
+        return 1;
+    }
+    if (fscanf(f, "%li\n", &id) == 1) {
+        pci_dev->dev.config[2] = id & 0xff;
+        pci_dev->dev.config[3] = (id & 0xff00) >> 8;
+    }
     fclose(f);

     dev->region_number = r;
--
1.5.4.5
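The byte layout this patch writes can be sketched in isolation: PCI config space keeps the 16-bit Vendor ID at offset 0 and the Device ID at offset 2, both little-endian. The helper names below are hypothetical, and the sketch parses a string with strtoul() rather than reading the sysfs files with fscanf() as the patch does. Note that in the patch the two comments appear to be swapped relative to the files they read (the "device ID" comment precedes the vendor file and vice versa); the offsets themselves match the layout described here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Store the low 16 bits of value at config[offset], little-endian. */
static void config_set_word(uint8_t *config, unsigned offset,
                            unsigned long value)
{
    assert(config != NULL);
    config[offset] = value & 0xff;
    config[offset + 1] = (value & 0xff00) >> 8;
}

/*
 * Parse an ID in the form found in a sysfs vendor or device file,
 * e.g. "0x8086\n". Base 0 lets strtoul() accept the 0x prefix.
 * Returns 0 on success, -1 if no digits were found.
 */
static int parse_sysfs_id(const char *text, unsigned long *id)
{
    char *end;

    assert(text != NULL && id != NULL);
    *id = strtoul(text, &end, 0);
    return end == text ? -1 : 0;
}
```

A caller would parse the vendor string, store it with `config_set_word(config, 0, id)`, and then do the same for the device string at offset 2.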
[PATCH 06/16] Support for device capability
This framework can be easily extended to support device capability, like MSI/MSI-x.

Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 qemu/hw/pci.c |   77 +++-
 qemu/hw/pci.h |   29 +
 2 files changed, 104 insertions(+), 2 deletions(-)

diff --git a/qemu/hw/pci.c b/qemu/hw/pci.c
index 821646c..eca0517 100644
--- a/qemu/hw/pci.c
+++ b/qemu/hw/pci.c
@@ -427,8 +427,8 @@ static void pci_update_mappings(PCIDevice *d)
     }
 }

-uint32_t pci_default_read_config(PCIDevice *d,
-                                 uint32_t address, int len)
+static uint32_t pci_read_config(PCIDevice *d,
+                                uint32_t address, int len)
 {
     uint32_t val;

@@ -453,6 +453,45 @@ uint32_t pci_default_read_config(PCIDevice *d,
     return val;
 }

+static void pci_write_config(PCIDevice *pci_dev,
+                             uint32_t address, uint32_t val, int len)
+{
+    int i;
+    for (i = 0; i < len; i++) {
+        pci_dev->config[address + i] = val & 0xff;
+        val >>= 8;
+    }
+}
+
+int pci_access_cap_config(PCIDevice *pci_dev, uint32_t address, int len)
+{
+    if (pci_dev->cap.supported && address >= pci_dev->cap.start &&
+        (address + len) < pci_dev->cap.start + pci_dev->cap.length)
+        return 1;
+    return 0;
+}
+
+uint32_t pci_default_cap_read_config(PCIDevice *pci_dev,
+                                     uint32_t address, int len)
+{
+    return pci_read_config(pci_dev, address, len);
+}
+
+void pci_default_cap_write_config(PCIDevice *pci_dev,
+                                  uint32_t address, uint32_t val, int len)
+{
+    pci_write_config(pci_dev, address, val, len);
+}
+
+uint32_t pci_default_read_config(PCIDevice *d,
+                                 uint32_t address, int len)
+{
+    if (pci_access_cap_config(d, address, len))
+        return d->cap.config_read(d, address, len);
+
+    return pci_read_config(d, address, len);
+}
+
 void pci_default_write_config(PCIDevice *d,
                               uint32_t address, uint32_t val, int len)
 {
@@ -485,6 +524,11 @@ void pci_default_write_config(PCIDevice *d,
         return;
     }
 default_config:
+    if (pci_access_cap_config(d, address, len)) {
+        d->cap.config_write(d, address, val, len);
+        return;
+    }
+
     /* not efficient, but simple */
     addr = address;
     for(i = 0; i < len; i++) {
@@ -905,3 +949,32 @@ PCIBus *pci_bridge_init(PCIBus *bus, int devfn, uint16_t vid, uint16_t did,
     s->bus = pci_register_secondary_bus(&s->dev, map_irq);
     return s->bus;
 }
+
+int pci_enable_capability_support(PCIDevice *pci_dev,
+                                  uint32_t config_start,
+                                  PCICapConfigReadFunc *config_read,
+                                  PCICapConfigWriteFunc *config_write,
+                                  PCICapConfigInitFunc *config_init)
+{
+    if (!pci_dev)
+        return -ENODEV;
+
+    if (config_start == 0)
+        pci_dev->cap.start = PCI_CAPABILITY_CONFIG_DEFAULT_START_ADDR;
+    else if (config_start >= 0x40 && config_start < 0xff)
+        pci_dev->cap.start = config_start;
+    else
+        return -EINVAL;
+
+    if (config_read)
+        pci_dev->cap.config_read = config_read;
+    else
+        pci_dev->cap.config_read = pci_default_cap_read_config;
+    if (config_write)
+        pci_dev->cap.config_write = config_write;
+    else
+        pci_dev->cap.config_write = pci_default_cap_write_config;
+    pci_dev->cap.supported = 1;
+    pci_dev->config[PCI_CAPABILITY_LIST] = pci_dev->cap.start;
+    return config_init(pci_dev);
+}
diff --git a/qemu/hw/pci.h b/qemu/hw/pci.h
index 2327215..127dbed 100644
--- a/qemu/hw/pci.h
+++ b/qemu/hw/pci.h
@@ -139,6 +139,12 @@ typedef void PCIMapIORegionFunc(PCIDevice *pci_dev, int region_num,
                                 uint32_t addr, uint32_t size, int type);
 typedef int PCIUnregisterFunc(PCIDevice *pci_dev);

+typedef void PCICapConfigWriteFunc(PCIDevice *pci_dev,
+                                   uint32_t address, uint32_t val, int len);
+typedef uint32_t PCICapConfigReadFunc(PCIDevice *pci_dev,
+                                      uint32_t address, int len);
+typedef int PCICapConfigInitFunc(PCIDevice *pci_dev);
+
 #define PCI_ADDRESS_SPACE_MEM 0x00
 #define PCI_ADDRESS_SPACE_IO 0x01
 #define PCI_ADDRESS_SPACE_MEM_PREFETCH 0x08
@@ -197,6 +203,10 @@ typedef struct PCIIORegion {
 #define PCI_COMMAND_RESERVED_MASK_HI (PCI_COMMAND_RESERVED >> 8)

+#define PCI_CAPABILITY_CONFIG_MAX_LENGTH 0x60
+#define PCI_CAPABILITY_CONFIG_DEFAULT_START_ADDR 0x40
+#define PCI_CAPABILITY_CONFIG_MSI_LENGTH 0x10
+
 struct PCIDevice {
     /* PCI config space */
     uint8_t config[256];
@@ -219,6 +229,14 @@ struct PCIDevice {
     /* Current IRQ levels. Used internally by the generic PCI code. */
     int irq_state[4];
+
+    /* Device capability configuration space */
+    struct {
+        int supported;
+        unsigned int
[PATCH 05/16] Figure out device capability
Try to figure out device capabilities in update_dev_cap(). For now we only care about the MSI capability. The function pci_find_cap_offset() was originally written by Allen for Xen. Note that the function needs root privilege to work. This depends on libpci.

Signed-off-by: Allen Kay allen.m@intel.com
Signed-off-by: Sheng Yang sh...@linux.intel.com
---
 qemu/hw/device-assignment.c |   29 +
 qemu/hw/device-assignment.h |    1 +
 2 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index e8a69ba..a354681 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -219,6 +219,35 @@ static void assigned_dev_ioport_map(PCIDevice *pci_dev, int region_num,
                           (r_dev->v_addrs + region_num));
 }

+static uint8_t pci_find_cap_offset(struct pci_dev *pci_dev, uint8_t cap)
+{
+    int id;
+    int max_cap = 48;
+    int pos = PCI_CAPABILITY_LIST;
+    int status;
+
+    status = pci_read_byte(pci_dev, PCI_STATUS);
+    if ((status & PCI_STATUS_CAP_LIST) == 0)
+        return 0;
+
+    while (max_cap--) {
+        pos = pci_read_byte(pci_dev, pos);
+        if (pos < 0x40)
+            break;
+
+        pos &= ~3;
+        id = pci_read_byte(pci_dev, pos + PCI_CAP_LIST_ID);
+
+        if (id == 0xff)
+            break;
+        if (id == cap)
+            return pos;
+
+        pos += PCI_CAP_LIST_NEXT;
+    }
+    return 0;
+}
+
 static void assigned_dev_pci_write_config(PCIDevice *d, uint32_t address,
                                           uint32_t val, int len)
 {
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
index da775d7..0fd78de 100644
--- a/qemu/hw/device-assignment.h
+++ b/qemu/hw/device-assignment.h
@@ -29,6 +29,7 @@
 #define __DEVICE_ASSIGNMENT_H__

 #include <sys/mman.h>
+#include <pci/pci.h>
 #include "qemu-common.h"
 #include "sys-queue.h"
 #include "pci.h"
--
1.5.4.5
Re: kvm-autotest -- introducing kvm_runtest_2
* Michael Goldish mgold...@redhat.com [2009-03-12 09:04]:
> > yep, used stepeditor to fix; definitely worth documenting where one
> > should be invoking stepeditor -- from the steps dir; if you don't run
> > it from there, it won't find the steps_data dir =(
>
> Are you absolutely sure about that? That's not the way it's supposed to
> be. I tried running it on several machines and it worked every time
> regardless of where I invoked it from. Since it resides in the
> kvm_runtest_2 dir, I usually just change to that directory and type
> ./stepeditor.py. Then I use file-open and pick the steps file, and it
> works.

You're right, it was the stepfile that I opened, since the data dir
variable is created from the name of the stepfile.

> If you have a very recent version, you should have a dir named
> steps_data under kvm_runtest_2, right next to steps. Inside steps_data
> you should have the data dirs. For steps/RHEL5.steps the corresponding
> data dir would be steps_data/RHEL5.steps_data/.

I've got whatever is latest in the public repo.

> If you have a slightly older version, you should have the data dirs
> inside the steps dir, next to the stepfiles themselves. For
> steps/RHEL5.steps, the corresponding data dir would be
> steps/RHEL5.steps_data/.
>
> > I'll have to go back and re-read your email on where to put the
> > reference ppm files so one gets the reference comparison.
>
> The paragraph above applies to the reference comparison as well.

OK, cool.

> > Right - I suppose it might be better if the names of the Windows ISO
> > disks matched how MS names them in MSDN. For example, kvm_runtest
> > refers to Windows2008-x64.iso, which doesn't match any name from MSDN;
> > what we have is:
> >
> > en_windows_server_2008_datacenter_enterprise_standard_x64_dvd_X14-26714.iso
>
> This is a very good idea. I wonder how we can find out the MSDN names of
> the ISOs we have. BTW, did the ISO you mentioned work with kvm_runtest?

MSDN lists the md5 and maybe sha1 hashes for the ISOs on the website where
they are downloaded.

That ISO works until the step where it needs to set the password for the
user, and as we've discussed, without the original ppm files, I can't
figure out why it fails to match that screen.

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ry...@us.ibm.com
[PATCH] KVM: Improvements for task switching
NSN's proprietary OS DMX sometimes does task switches. To get it running in
KVM the following changes were necessary:

- Interrupt injection only with interrupt flag set.
- Linking the tss->prev_task_link to itself removed.
- Task linking is required for CALL and GATE.
- Do not call skip_emulated_instruction() for GATE.

Signed-off-by: Bernhard Kohl bernhard.k...@nsn.com
---
 arch/x86/kvm/vmx.c |    3 ++-
 arch/x86/kvm/x86.c |   19 +++++++++++++++++--
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 5cf28df..eca57a3 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3357,7 +3357,8 @@ static void vmx_intr_assist(struct kvm_vcpu *vcpu)
                         enable_irq_window(vcpu);
         }
         if (vcpu->arch.interrupt.pending) {
-                vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr);
+                if (vcpu->arch.interrupt_window_open)
+                        vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr);
                 if (kvm_cpu_has_interrupt(vcpu))
                         enable_irq_window(vcpu);
         }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b556b6a..9052058 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3683,7 +3683,7 @@ static void save_state_to_tss32(struct kvm_vcpu *vcpu,
         tss->fs = get_segment_selector(vcpu, VCPU_SREG_FS);
         tss->gs = get_segment_selector(vcpu, VCPU_SREG_GS);
         tss->ldt_selector = get_segment_selector(vcpu, VCPU_SREG_LDTR);
-        tss->prev_task_link = get_segment_selector(vcpu, VCPU_SREG_TR);
+        tss->prev_task_link = 0;
 }
 
 static int load_state_from_tss32(struct kvm_vcpu *vcpu,
@@ -3810,6 +3810,7 @@ out:
 
 static int kvm_task_switch_32(struct kvm_vcpu *vcpu, u16 tss_selector,
                        u32 old_tss_base,
+                       u16 old_tss_selector, int reason,
                        struct desc_struct *nseg_desc)
 {
         struct tss_segment_32 tss_segment_32;
@@ -3829,6 +3830,18 @@ static int kvm_task_switch_32(struct kvm_vcpu *vcpu, u16 tss_selector,
                            &tss_segment_32, sizeof tss_segment_32))
                 goto out;
 
+        /*
+         * SDM 3: table 6-2
+         * Task linking required for CALL and GATE.
+         */
+        if (reason == TASK_SWITCH_CALL || reason == TASK_SWITCH_GATE)
+        {
+                tss_segment_32.prev_task_link = old_tss_selector;
+                kvm_write_guest(vcpu->kvm, get_tss_base_addr(vcpu, nseg_desc),
+                                &tss_segment_32, sizeof(struct tss_segment_32));
+
+        }
+
         if (load_state_from_tss32(vcpu, &tss_segment_32))
                 goto out;
 
@@ -3882,10 +3895,12 @@ int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason)
                 kvm_x86_ops->set_rflags(vcpu, eflags & ~X86_EFLAGS_NT);
         }
 
-        kvm_x86_ops->skip_emulated_instruction(vcpu);
+        if (reason != TASK_SWITCH_GATE)
+                kvm_x86_ops->skip_emulated_instruction(vcpu);
 
         if (nseg_desc.type & 8)
                 ret = kvm_task_switch_32(vcpu, tss_selector, old_tss_base,
+                                         old_tss_sel, reason,
                                          &nseg_desc);
         else
                 ret = kvm_task_switch_16(vcpu, tss_selector, old_tss_base,
-- 
1.6.0.6
missing kvm smp tlb flush in invlpg
From: Andrea Arcangeli aarca...@redhat.com

While looking at the invlpg out-of-sync code with Izik, I think I noticed a
missing smp tlb flush here. Without it, the other cpu can still write to a
freed host physical page. If rmap_remove is called, the smp tlb flush must
always happen before mmu_lock is released, because the VM will take the
mmu_lock before it can finally add the page to the freelist after swapout.
The mmu notifier makes it safe to flush the tlb after freeing the page
(otherwise it would never be safe), so we can do a single flush for
multiple invalidated sptes.

Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---

diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index a0c11ea..855eb71 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -445,6 +445,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
         gpa_t pte_gpa = -1;
         int level;
         u64 *sptep;
+        int need_flush = 0;
 
         spin_lock(&vcpu->kvm->mmu_lock);
 
@@ -464,6 +465,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
                                 rmap_remove(vcpu->kvm, sptep);
                                 if (is_large_pte(*sptep))
                                         --vcpu->kvm->stat.lpages;
+                                need_flush = 1;
                         }
                         set_shadow_pte(sptep, shadow_trap_nonpresent_pte);
                         break;
@@ -473,6 +475,8 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
                         break;
         }
 
+        if (need_flush)
+                kvm_flush_remote_tlbs(vcpu->kvm);
         spin_unlock(&vcpu->kvm->mmu_lock);
 
         if (pte_gpa == -1)
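The "remember that something was removed, then flush once" pattern from the patch can be sketched with a toy model. Everything here (struct toy_mmu, toy_zap, toy_flush_remote_tlbs) is hypothetical and stands in for the real MMU structures; the point is only that several invalidations share a single flush, and that no flush is issued when nothing present was removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model: a table of mappings and a count of remote flushes. */
struct toy_mmu {
    bool present[8];
    int  remote_flushes;
};

/* Stand-in for the expensive cross-cpu TLB flush. */
static void toy_flush_remote_tlbs(struct toy_mmu *mmu)
{
    mmu->remote_flushes++;
}

/*
 * Drop every mapping selected by `mask`, then flush once if anything that
 * was actually present got removed. In real code the whole body, including
 * the flush, would run with the MMU lock held.
 */
static void toy_zap(struct toy_mmu *mmu, unsigned mask)
{
    bool need_flush = false;

    assert(mmu != NULL);
    for (size_t i = 0; i < 8; i++) {
        if (!(mask & (1u << i)))
            continue;
        if (mmu->present[i])
            need_flush = true;      /* a live mapping went away */
        mmu->present[i] = false;
    }
    if (need_flush)
        toy_flush_remote_tlbs(mmu);
}
```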
[PATCH] kvm: external module: support building against Windriver 2.0 (kernel 2.6.21)
This is needed to compile kvm in a Windriver 2.0 distribution (kernel 2.6.21). This kernel has an include file marker.h, but trace_mark is not defined there. So the compat code in kernel/include-compat/linux/marker.h is not included. Signed-off-by: Bernhard Kohl bernhard.k...@nsn.com --- kernel/external-module-compat-comm.h |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/kernel/external-module-compat-comm.h b/kernel/external-module-compat-comm.h index a14cea2..e40501e 100644 --- a/kernel/external-module-compat-comm.h +++ b/kernel/external-module-compat-comm.h @@ -25,6 +25,10 @@ # undef CONFIG_KVM_TRACE #endif +#if LINUX_VERSION_CODE = KERNEL_VERSION(2,6,21) +#define trace_mark(args...) ((void)0) +#endif + /* * 2.6.16 does not have GFP_NOWAIT */ -- 1.6.0.6 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
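As background for the version guard, here is a small sketch of how such a comparison works. The TOY_ names are invented so the snippet builds outside a kernel tree; the packing mirrors the classic KERNEL_VERSION(a,b,c) layout of one byte each for minor and patch level, and the no-op macro uses the same GNU named-variadic style as the patch. It deliberately does not take a position on which comparison operator the guard should use.

```c
#include <assert.h>

/* Same packing as the classic KERNEL_VERSION(): major, minor, patch level. */
#define TOY_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* No-op variadic macro, GNU named-argument style (a gcc/clang extension). */
#define toy_trace_mark(args...) ((void)0)

/* Encoded values order the same way the dotted version numbers do. */
static void toy_check_ordering(void)
{
    assert(TOY_KERNEL_VERSION(2, 6, 16) < TOY_KERNEL_VERSION(2, 6, 21));
    assert(TOY_KERNEL_VERSION(2, 6, 21) < TOY_KERNEL_VERSION(2, 6, 24));

    /* All arguments are discarded by the preprocessor. */
    toy_trace_mark(kvm_example, "ignored %d", 1);
}
```

Because the encoded values are plain integers, a guard can single out one release or a whole range of releases with ordinary preprocessor comparisons.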
Re: [PATCH] kvm: external module: support building against Windriver 2.0 (kernel 2.6.21)
Bernhard Kohl wrote: This is needed to compile kvm in a Windriver 2.0 distribution (kernel 2.6.21). This kernel has an include file marker.h, but trace_mark is not defined there. So the compat code in kernel/include-compat/linux/marker.h is not included. I bet this is because Wind River patched some variant of LTTng into their kernel. However, I'm unsure if supporting significantly modified distribution kernels is in the scope of this compat layer. If it is ok for the maintainers, you should try to make the test more Wind River specific (did you check that there is no side-effect for normal = 2.6.21 kernels?) and maybe add a comment. Jan Signed-off-by: Bernhard Kohl bernhard.k...@nsn.com --- kernel/external-module-compat-comm.h |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/kernel/external-module-compat-comm.h b/kernel/external-module-compat-comm.h index a14cea2..e40501e 100644 --- a/kernel/external-module-compat-comm.h +++ b/kernel/external-module-compat-comm.h @@ -25,6 +25,10 @@ # undef CONFIG_KVM_TRACE #endif +#if LINUX_VERSION_CODE = KERNEL_VERSION(2,6,21) +#define trace_mark(args...) ((void)0) +#endif + /* * 2.6.16 does not have GFP_NOWAIT */ -- Siemens AG, Corporate Technology, CT SE 2 Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: Improvements for task switching
Bernhard Kohl wrote:
> NSN's proprietary OS DMX sometimes does task switches. To get it running
> in KVM the following changes were necessary:
>
> - Interrupt injection only with interrupt flag set.
> - Linking the tss->prev_task_link to itself removed.
> - Task linking is required for CALL and GATE.
> - Do not call skip_emulated_instruction() for GATE.

Please post independent changes as separate patches. I guess the task
linking changes belong together, but surely not to the IRQ injection
patch. And the last change looks independent, too.

Another wish (specifically as this is tricky stuff): also describe in the
commit log why you changed something.

> Signed-off-by: Bernhard Kohl bernhard.k...@nsn.com
> ---
>  arch/x86/kvm/vmx.c |    3 ++-
>  arch/x86/kvm/x86.c |   19 +++++++++++++++++--
>  2 files changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 5cf28df..eca57a3 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -3357,7 +3357,8 @@ static void vmx_intr_assist(struct kvm_vcpu *vcpu)
>                          enable_irq_window(vcpu);
>          }
>          if (vcpu->arch.interrupt.pending) {
> -                vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr);
> +                if (vcpu->arch.interrupt_window_open)
> +                        vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr);
>                  if (kvm_cpu_has_interrupt(vcpu))
>                          enable_irq_window(vcpu);
>          }

That causes concerns on my side as we had a hard time stabilizing this
code. Need to think about it.

Do you happen to have a test case for this (if it's not publicly
shareable, contact me directly)? Did you check that this change causes no
obvious regressions to other guests? What about the user-inject IRQ case,
does it already work for you as-is?

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b556b6a..9052058 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3683,7 +3683,7 @@ static void save_state_to_tss32(struct kvm_vcpu *vcpu,
>          tss->fs = get_segment_selector(vcpu, VCPU_SREG_FS);
>          tss->gs = get_segment_selector(vcpu, VCPU_SREG_GS);
>          tss->ldt_selector = get_segment_selector(vcpu, VCPU_SREG_LDTR);
> -        tss->prev_task_link = get_segment_selector(vcpu, VCPU_SREG_TR);
> +        tss->prev_task_link = 0;
>  }
>
>  static int load_state_from_tss32(struct kvm_vcpu *vcpu,
> @@ -3810,6 +3810,7 @@ out:
>
>  static int kvm_task_switch_32(struct kvm_vcpu *vcpu, u16 tss_selector,
>                         u32 old_tss_base,
> +                       u16 old_tss_selector, int reason,
>                         struct desc_struct *nseg_desc)
>  {
>          struct tss_segment_32 tss_segment_32;

What about 16-bit switches, are they already correct?

> @@ -3829,6 +3830,18 @@ static int kvm_task_switch_32(struct kvm_vcpu *vcpu, u16 tss_selector,
>                             &tss_segment_32, sizeof tss_segment_32))
>                  goto out;
>
> +        /*
> +         * SDM 3: table 6-2
> +         * Task linking required for CALL and GATE.
> +         */
> +        if (reason == TASK_SWITCH_CALL || reason == TASK_SWITCH_GATE)
> +        {
> +                tss_segment_32.prev_task_link = old_tss_selector;
> +                kvm_write_guest(vcpu->kvm, get_tss_base_addr(vcpu, nseg_desc),
> +                                &tss_segment_32, sizeof(struct tss_segment_32));
> +
> +        }
> +
>          if (load_state_from_tss32(vcpu, &tss_segment_32))
>                  goto out;
>
> @@ -3882,10 +3895,12 @@ int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason)
>                  kvm_x86_ops->set_rflags(vcpu, eflags & ~X86_EFLAGS_NT);
>          }
>
> -        kvm_x86_ops->skip_emulated_instruction(vcpu);
> +        if (reason != TASK_SWITCH_GATE)
> +                kvm_x86_ops->skip_emulated_instruction(vcpu);
>
>          if (nseg_desc.type & 8)
>                  ret = kvm_task_switch_32(vcpu, tss_selector, old_tss_base,
> +                                         old_tss_sel, reason,
>                                           &nseg_desc);
>          else
>                  ret = kvm_task_switch_16(vcpu, tss_selector, old_tss_base,

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Re: [PATCH] KVM: Improvements for task switching
Jan Kiszka wrote:
> Bernhard Kohl wrote:
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 5cf28df..eca57a3 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3357,7 +3357,8 @@ static void vmx_intr_assist(struct kvm_vcpu *vcpu)
>>                          enable_irq_window(vcpu);
>>          }
>>          if (vcpu->arch.interrupt.pending) {
>> -                vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr);
>> +                if (vcpu->arch.interrupt_window_open)
>> +                        vmx_inject_irq(vcpu, vcpu->arch.interrupt.nr);
>>                  if (kvm_cpu_has_interrupt(vcpu))
>>                          enable_irq_window(vcpu);
>>          }
>
> That causes concerns on my side as we had a hard time stabilizing this
> code. Need to think about it.
>
> Do you happen to have a test case for this (if it's not publicly
> shareable, contact me directly)? Did you check that this change causes no
> obvious regressions to other guests? What about the user-inject IRQ case,
> does it already work for you as-is?

Hmm, do_interrupt_requests will most likely not cause trouble, as it both
pends and injects interrupts only when the window is open.

I don't get the scenario behind this here yet, but I think it would be a
very good chance to align the code layout of vmx_intr_assist to
do_interrupt_requests in this respect, either finally de-optimizing or
even breaking both :) - or bringing them into the same correct form.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Unable to re-establish VNC connection after some time
Hello fellow kvm users/admins,

I'm currently running kvm-84 with linux-2.6.28.4 on x86_64. My two Linux
guests are basically running the same kernel (minus iSCSI, multipath and
kvm support). Another guest is a Win2008 server.

All of the machines experience the same problem: after a while I can't
connect via VNC. UltraVNC as well as RealVNC experience timeouts waiting
for the server to respond.

All machines show:

(qemu) info vnc
VNC server active on: 0.0.0.0:1
Client connected

It looks like something has died there ... could it be because I didn't
disconnect my VNC client properly?

Even (qemu) system_reset will not bring the screen back.

I am open to suggestions.

-- 
Regards,
Andreas Olsowski mailto:andreas.olsow...@uni-lueneburg.de
System- und Netzwerktechnik
Sysadmin extraordinaire
Leuphana Universität Lüneburg
Scharnhorststraße 1
21335 Lüneburg
Tel: 04131 / 677-1309
Mobil: 0175 / 5720275
RE: x86: use smp_send_reschedule in kvm_vcpu_kick
We also hacked the source in the same way as the patch, but the issue is
not caused by it. We are still trying to figure out the reason.
Thanks!
Xiantao

-----Original Message-----
From: Gleb Natapov [mailto:g...@redhat.com]
Sent: Thursday, March 12, 2009 7:04 PM
To: Zhang, Xiantao
Cc: Avi Kivity; Marcelo Tosatti; Ingo Molnar; kvm@vger.kernel.org; Peter Zijlstra
Subject: Re: x86: use smp_send_reschedule in kvm_vcpu_kick

On Thu, Mar 12, 2009 at 10:31:47AM +0800, Zhang, Xiantao wrote:
> Avi Kivity wrote:
> > Marcelo Tosatti wrote:
> >> OK, reworked patch:
> >>
> >> - change ia64 in addition to x86
> >> - add comment on smp send reschedule handlers about KVM's usage
> >>
> >> Untested on IA64.
> >>
> >> KVM: use smp_send_reschedule in kvm_vcpu_kick
> >>
> >> KVM uses a function call IPI to cause the exit of a guest running on a
> >> physical cpu. For virtual interrupt notification there is no need to
> >> wait on IPI receipt, or to execute any function.
> >>
> >> This is exactly what the reschedule IPI does, without the overhead of
> >> function IPI. So use it instead of smp_call_function_single in
> >> kvm_vcpu_kick.
> >>
> >> Also change the guest_mode variable to a bit in vcpu->requests, and use
> >> that to collapse multiple IPIs that would be issued between the first
> >> one and zeroing of guest mode.
> >>
> >> This allows kvm_vcpu_kick to be called with interrupts disabled.
> >
> > Looks good. Will wait for Xiantao's test-n-ack before applying.
>
> kvm-ia64 is broken due to recent check-ins about irq-bits, and we are
> trying to fix it. For this patch, ia64 has to export the symbol
> smp_send_reschedule before applying the patch.

Can you try this patch please: http://patchwork.kernel.org/patch/11103/

--
			Gleb.