Re: [PATCH 5/6] KVM/userspace: Device Assignment: Support for assigning PCI devices to guests
On Fri, Oct 24, 2008 at 11:22:48AM -0500, Anthony Liguori wrote: Amit Shah wrote: +#include linux/kvm_para.h Is this header really necessary? No, removed. +#include device-assignment.h + +/* From linux/ioport.h */ +#define IORESOURCE_IO 0x0100 /* Resource type */ +#define IORESOURCE_MEM 0x0200 +#define IORESOURCE_IRQ 0x0400 +#define IORESOURCE_DMA 0x0800 +#define IORESOURCE_PREFETCH 0x1000 /* No side effects */ + +/* #define DEVICE_ASSIGNMENT_DEBUG 1 */ + +#ifdef DEVICE_ASSIGNMENT_DEBUG +#define DEBUG(fmt, args...) \ Please use C99 style varidacs. Done. +do { \ + fprintf(stderr, %s: fmt, __func__ , ## args);\ +} while (0) +#else +#define DEBUG(fmt, args...) do { } while(0) +#endif + +static void assigned_dev_ioport_writeb(void *opaque, uint32_t addr, + uint32_t value) +{ +AssignedDevRegion *r_access = (AssignedDevRegion *)opaque; Cast is unnecessary. Removed. +uint32_t r_pio = (unsigned long)r_access-r_virtbase ++ (addr - r_access-e_physbase); It would be nice to make this a function to make it more obvious that you were translated from guest to host regions. The cast to unsigned long should probably be target_ulong too. Done. +DEBUG(stderr, %s: r_pio=%08x e_physbase=%08x + r_virtbase=%08lx value=%08x\n, + __func__, r_pio, (int)r_access-e_physbase, + (unsigned long)r_access-r_virtbase, value); This debug statement looks wrong to me. You're passing stderr. It's true for all of these functions. Fixed. +static void assigned_dev_iomem_map(PCIDevice *pci_dev, int region_num, + uint32_t e_phys, uint32_t e_size, int type) +{ +AssignedDevice *r_dev = (AssignedDevice *) pci_dev; +AssignedDevRegion *region = r_dev-v_addrs[region_num]; +int first_map = (region-e_size == 0); +int ret = 0; + +DEBUG(%s: e_phys=%08x r_virt=%x type=%d len=%08x region_num=%d \n, + __func__, e_phys, (uint32_t)region-r_virtbase, type, e_size, + region_num); You already have __func__ in your debug printf(). Fixed. +region-e_physbase = e_phys; +region-e_size = e_size; + +/* FIXME: Add support for emulated MMIO for non-kvm guests */ +if (kvm_enabled()) { I don't think having a kvm_enabled() check here is very useful. I think device-assignment.c should be conditional on USE_KVM, and the only kvm_enabled() check should be when creating the initial device assignment. Practically speaking, QEMU is never going to support device assignment outside of the context of KVM because I strongly doubt anything like irqhook will make it upstream. Reworked along your suggestions, please let me know if you have further comments. +if (!first_map) +kvm_destroy_phys_mem(kvm_context, e_phys, e_size); +if (e_size 0) +ret = kvm_register_phys_mem(kvm_context, e_phys, +region-r_virtbase, e_size, 0); +if (ret != 0) +fprintf(stderr, %s: Error: create new mapping failed\n, __func__); If we do get an error here, we shouldn't keep going. This error is probably going to happen in practice if a guest tries to pass through too many devices and we run out of slots. Fixed, we exit(1) now (is there a more graceful to bail out?). +} +} + +static void assigned_dev_ioport_map(PCIDevice *pci_dev, int region_num, +uint32_t addr, uint32_t size, int type) +{ +AssignedDevice *r_dev = (AssignedDevice *) pci_dev; +AssignedDevRegion *region = r_dev-v_addrs[region_num]; +int r; + +region-e_physbase = addr; +region-e_size = size; + +DEBUG(%s: e_phys=0x%x r_virt=%x type=0x%x len=%d region_num=%d \n, + __func__, addr, (uint32_t)region-r_virtbase, type, size, region_num); Need to fix this DEBUG(). Fixed. +r = ioperm((uint32_t)region-r_virtbase, size, 1); I don't think this is enough for KVM. This will only do the ioperm in the thread that triggered the IO. If you have an SMP guest, ioperm needs to be done on each VCPU's thread. Fixed. +if (r 0) { +perror(assigned_dev_ioport_map: ioperm); +return; +} Again, if we fail, we have to exit QEMU gracefully. Fixed. +register_ioport_read(addr, size, 1, assigned_dev_ioport_readb, + (void *) (r_dev-v_addrs + region_num)); +register_ioport_read(addr, size, 2, assigned_dev_ioport_readw, + (void *) (r_dev-v_addrs + region_num)); +register_ioport_read(addr, size, 4, assigned_dev_ioport_readl, + (void *) (r_dev-v_addrs + region_num)); +
Re: [PATCH 5/6] KVM/userspace: Device Assignment: Support for assigning PCI devices to guests
On Mon, Oct 27, 2008 at 02:32:48PM +0800, Han, Weidong wrote: Yes, it's buggy. It should like: uint32_t old_ephys = region-e_physbase; uint32_t old_esize = region-e_size; ... kvm_destroy_phys_mem(kvm_context, old_ephys, old_esize); Fixed in v8. Thanks! Cheers, Muli -- The First Workshop on I/O Virtualization (WIOV '08) Dec 2008, San Diego, CA, http://www.usenix.org/wiov08/ - SYSTOR 2009---The Israeli Experimental Systems Conference http://www.haifa.il.ibm.com/conferences/systor2009/ -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 5/6] KVM/userspace: Device Assignment: Support for assigning PCI devices to guests
Su, Disheng wrote: Amit Shah wrote: This patch has been contributed to by the following people: From: Or Sagi [EMAIL PROTECTED] From: Nir Peleg [EMAIL PROTECTED] From: Amit Shah [EMAIL PROTECTED] From: Ben-Ami Yassour [EMAIL PROTECTED] From: Weidong Han [EMAIL PROTECTED] From: Glauber de Oliveira Costa [EMAIL PROTECTED] With this patch, we can assign a device on the host machine to a guest. A new command-line option, -pcidevice is added. To invoke it for a device sitting at PCI bus:dev.fn 04:08.0, use this: -pcidevice host=04:08.0 * The host driver for the device, if any, is to be removed before assigning the device (else device assignment will fail). * A device that shares IRQ with another host device cannot currently be assigned. * The RAW_IO capability is needed for this to work This works only with the in-kernel irqchip method; to use the userspace irqchip, a kernel module (irqhook) and some extra changes are needed. Signed-off-by: Amit Shah [EMAIL PROTECTED] --- qemu/Makefile.target|1 + qemu/hw/device-assignment.c | 619 +++ qemu/hw/device-assignment.h | 98 +++ qemu/hw/pc.c |6 + qemu/hw/pci.c |7 + qemu/vl.c | 18 ++ 6 files changed, 749 insertions(+), 0 deletions(-) create mode 100644 qemu/hw/device-assignment.c create mode 100644 qemu/hw/device-assignment.h diff --git a/qemu/Makefile.target b/qemu/Makefile.target index d9bdeca..05a1d84 100644 --- a/qemu/Makefile.target +++ b/qemu/Makefile.target @@ -621,6 +621,7 @@ OBJS+= ide.o pckbd.o ps2.o vga.o $(SOUND_HW) dma.o OBJS+= fdc.o mc146818rtc.o serial.o i8259.o i8254.o pcspk.o pc.o OBJS+= cirrus_vga.o apic.o parallel.o acpi.o piix_pci.o OBJS+= usb-uhci.o vmmouse.o vmport.o vmware_vga.o extboot.o +OBJS+= device-assignment.o ifeq ($(USE_KVM_PIT), 1) OBJS+= i8254-kvm.o endif diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c new file mode 100644 index 000..5ba21a0 --- /dev/null +++ b/qemu/hw/device-assignment.c @@ -0,0 +1,619 @@ +/* + * Copyright (c) 2007, Neocleus Corporation. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple + * Place - Suite 330, Boston, MA 02111-1307 USA. + * + * + * Assign a PCI device from the host to a guest VM. + * + * Adapted for KVM by Qumranet. + * + * Copyright (c) 2007, Neocleus, Alex Novik ([EMAIL PROTECTED]) + * Copyright (c) 2007, Neocleus, Guy Zana ([EMAIL PROTECTED]) + * Copyright (C) 2008, Qumranet, Amit Shah ([EMAIL PROTECTED]) + * Copyright (C) 2008, Red Hat, Amit Shah ([EMAIL PROTECTED]) + */ +#include stdio.h +#include sys/io.h +#include qemu-kvm.h +#include hw.h +#include pc.h +#include sysemu.h +#include console.h +#include linux/kvm_para.h +#include device-assignment.h + +/* From linux/ioport.h */ +#define IORESOURCE_IO 0x0100 /* Resource type */ +#define IORESOURCE_MEM 0x0200 +#define IORESOURCE_IRQ 0x0400 +#define IORESOURCE_DMA 0x0800 +#define IORESOURCE_PREFETCH 0x1000 /* No side effects */ + +/* #define DEVICE_ASSIGNMENT_DEBUG 1 */ + +#ifdef DEVICE_ASSIGNMENT_DEBUG +#define DEBUG(fmt, args...) \ +do { \ + fprintf(stderr, %s: fmt, __func__ , ## args);\ +} while (0) +#else +#define DEBUG(fmt, args...) do { } while(0) +#endif + +static void assigned_dev_ioport_writeb(void *opaque, uint32_t addr, + uint32_t value) +{ +AssignedDevRegion *r_access = (AssignedDevRegion *)opaque; +uint32_t r_pio = (unsigned long)r_access-r_virtbase ++ (addr - r_access-e_physbase); + +DEBUG(stderr, %s: r_pio=%08x e_physbase=%08x + r_virtbase=%08lx value=%08x\n, + __func__, r_pio, (int)r_access-e_physbase, + (unsigned long)r_access-r_virtbase, value); + outb(value, r_pio); +} + +static void assigned_dev_ioport_writew(void *opaque, uint32_t addr, + uint32_t value) +{ +AssignedDevRegion *r_access = (AssignedDevRegion *)opaque; +uint32_t r_pio = (unsigned long)r_access-r_virtbase ++ (addr - r_access-e_physbase); + +DEBUG(stderr, %s: r_pio=%08x e_physbase=%08x + r_virtbase=%08lx value=%08x\n, +
RE: [PATCH 5/6] KVM/userspace: Device Assignment: Support for assigning PCI devices to guests
Amit Shah wrote: This patch has been contributed to by the following people: From: Or Sagi [EMAIL PROTECTED] From: Nir Peleg [EMAIL PROTECTED] From: Amit Shah [EMAIL PROTECTED] From: Ben-Ami Yassour [EMAIL PROTECTED] From: Weidong Han [EMAIL PROTECTED] From: Glauber de Oliveira Costa [EMAIL PROTECTED] With this patch, we can assign a device on the host machine to a guest. A new command-line option, -pcidevice is added. To invoke it for a device sitting at PCI bus:dev.fn 04:08.0, use this: -pcidevice host=04:08.0 * The host driver for the device, if any, is to be removed before assigning the device (else device assignment will fail). * A device that shares IRQ with another host device cannot currently be assigned. * The RAW_IO capability is needed for this to work This works only with the in-kernel irqchip method; to use the userspace irqchip, a kernel module (irqhook) and some extra changes are needed. Signed-off-by: Amit Shah [EMAIL PROTECTED] --- qemu/Makefile.target|1 + qemu/hw/device-assignment.c | 619 +++ qemu/hw/device-assignment.h | 98 +++ qemu/hw/pc.c |6 + qemu/hw/pci.c |7 + qemu/vl.c | 18 ++ 6 files changed, 749 insertions(+), 0 deletions(-) create mode 100644 qemu/hw/device-assignment.c create mode 100644 qemu/hw/device-assignment.h diff --git a/qemu/Makefile.target b/qemu/Makefile.target index d9bdeca..05a1d84 100644 --- a/qemu/Makefile.target +++ b/qemu/Makefile.target @@ -621,6 +621,7 @@ OBJS+= ide.o pckbd.o ps2.o vga.o $(SOUND_HW) dma.o OBJS+= fdc.o mc146818rtc.o serial.o i8259.o i8254.o pcspk.o pc.o OBJS+= cirrus_vga.o apic.o parallel.o acpi.o piix_pci.o OBJS+= usb-uhci.o vmmouse.o vmport.o vmware_vga.o extboot.o +OBJS+= device-assignment.o ifeq ($(USE_KVM_PIT), 1) OBJS+= i8254-kvm.o endif diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c new file mode 100644 index 000..5ba21a0 --- /dev/null +++ b/qemu/hw/device-assignment.c @@ -0,0 +1,619 @@ +/* + * Copyright (c) 2007, Neocleus Corporation. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple + * Place - Suite 330, Boston, MA 02111-1307 USA. + * + * + * Assign a PCI device from the host to a guest VM. + * + * Adapted for KVM by Qumranet. + * + * Copyright (c) 2007, Neocleus, Alex Novik ([EMAIL PROTECTED]) + * Copyright (c) 2007, Neocleus, Guy Zana ([EMAIL PROTECTED]) + * Copyright (C) 2008, Qumranet, Amit Shah ([EMAIL PROTECTED]) + * Copyright (C) 2008, Red Hat, Amit Shah ([EMAIL PROTECTED]) + */ +#include stdio.h +#include sys/io.h +#include qemu-kvm.h +#include hw.h +#include pc.h +#include sysemu.h +#include console.h +#include linux/kvm_para.h +#include device-assignment.h + +/* From linux/ioport.h */ +#define IORESOURCE_IO 0x0100 /* Resource type */ +#define IORESOURCE_MEM 0x0200 +#define IORESOURCE_IRQ 0x0400 +#define IORESOURCE_DMA 0x0800 +#define IORESOURCE_PREFETCH 0x1000 /* No side effects */ + +/* #define DEVICE_ASSIGNMENT_DEBUG 1 */ + +#ifdef DEVICE_ASSIGNMENT_DEBUG +#define DEBUG(fmt, args...) \ +do { \ + fprintf(stderr, %s: fmt, __func__ , ## args);\ +} while (0) +#else +#define DEBUG(fmt, args...) do { } while(0) +#endif + +static void assigned_dev_ioport_writeb(void *opaque, uint32_t addr, + uint32_t value) +{ +AssignedDevRegion *r_access = (AssignedDevRegion *)opaque; +uint32_t r_pio = (unsigned long)r_access-r_virtbase ++ (addr - r_access-e_physbase); + +DEBUG(stderr, %s: r_pio=%08x e_physbase=%08x + r_virtbase=%08lx value=%08x\n, + __func__, r_pio, (int)r_access-e_physbase, + (unsigned long)r_access-r_virtbase, value); +outb(value, r_pio); +} + +static void assigned_dev_ioport_writew(void *opaque, uint32_t addr, + uint32_t value) +{ +AssignedDevRegion *r_access = (AssignedDevRegion *)opaque; +uint32_t r_pio = (unsigned long)r_access-r_virtbase ++ (addr - r_access-e_physbase); + +DEBUG(stderr, %s: r_pio=%08x e_physbase=%08x + r_virtbase=%08lx value=%08x\n, + __func__,
Re: [PATCH 5/6] KVM/userspace: Device Assignment: Support for assigning PCI devices to guests
Amit Shah wrote: This patch has been contributed to by the following people: From: Or Sagi [EMAIL PROTECTED] From: Nir Peleg [EMAIL PROTECTED] From: Amit Shah [EMAIL PROTECTED] From: Ben-Ami Yassour [EMAIL PROTECTED] From: Weidong Han [EMAIL PROTECTED] From: Glauber de Oliveira Costa [EMAIL PROTECTED] With this patch, we can assign a device on the host machine to a guest. A new command-line option, -pcidevice is added. To invoke it for a device sitting at PCI bus:dev.fn 04:08.0, use this: -pcidevice host=04:08.0 * The host driver for the device, if any, is to be removed before assigning the device (else device assignment will fail). * A device that shares IRQ with another host device cannot currently be assigned. * The RAW_IO capability is needed for this to work This works only with the in-kernel irqchip method; to use the userspace irqchip, a kernel module (irqhook) and some extra changes are needed. Signed-off-by: Amit Shah [EMAIL PROTECTED] --- qemu/Makefile.target|1 + qemu/hw/device-assignment.c | 619 +++ qemu/hw/device-assignment.h | 98 +++ qemu/hw/pc.c|6 + qemu/hw/pci.c |7 + qemu/vl.c | 18 ++ 6 files changed, 749 insertions(+), 0 deletions(-) create mode 100644 qemu/hw/device-assignment.c create mode 100644 qemu/hw/device-assignment.h diff --git a/qemu/Makefile.target b/qemu/Makefile.target index d9bdeca..05a1d84 100644 --- a/qemu/Makefile.target +++ b/qemu/Makefile.target @@ -621,6 +621,7 @@ OBJS+= ide.o pckbd.o ps2.o vga.o $(SOUND_HW) dma.o OBJS+= fdc.o mc146818rtc.o serial.o i8259.o i8254.o pcspk.o pc.o OBJS+= cirrus_vga.o apic.o parallel.o acpi.o piix_pci.o OBJS+= usb-uhci.o vmmouse.o vmport.o vmware_vga.o extboot.o +OBJS+= device-assignment.o ifeq ($(USE_KVM_PIT), 1) OBJS+= i8254-kvm.o endif diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c new file mode 100644 index 000..5ba21a0 --- /dev/null +++ b/qemu/hw/device-assignment.c @@ -0,0 +1,619 @@ +/* + * Copyright (c) 2007, Neocleus Corporation. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple + * Place - Suite 330, Boston, MA 02111-1307 USA. + * + * + * Assign a PCI device from the host to a guest VM. + * + * Adapted for KVM by Qumranet. + * + * Copyright (c) 2007, Neocleus, Alex Novik ([EMAIL PROTECTED]) + * Copyright (c) 2007, Neocleus, Guy Zana ([EMAIL PROTECTED]) + * Copyright (C) 2008, Qumranet, Amit Shah ([EMAIL PROTECTED]) + * Copyright (C) 2008, Red Hat, Amit Shah ([EMAIL PROTECTED]) + */ +#include stdio.h +#include sys/io.h +#include qemu-kvm.h +#include hw.h +#include pc.h +#include sysemu.h +#include console.h +#include linux/kvm_para.h Is this header really necessary? +#include device-assignment.h + +/* From linux/ioport.h */ +#define IORESOURCE_IO 0x0100 /* Resource type */ +#define IORESOURCE_MEM 0x0200 +#define IORESOURCE_IRQ 0x0400 +#define IORESOURCE_DMA 0x0800 +#define IORESOURCE_PREFETCH 0x1000 /* No side effects */ + +/* #define DEVICE_ASSIGNMENT_DEBUG 1 */ + +#ifdef DEVICE_ASSIGNMENT_DEBUG +#define DEBUG(fmt, args...) \ Please use C99 style varidacs. +do { \ + fprintf(stderr, %s: fmt, __func__ , ## args);\ +} while (0) +#else +#define DEBUG(fmt, args...) do { } while(0) +#endif + +static void assigned_dev_ioport_writeb(void *opaque, uint32_t addr, + uint32_t value) +{ +AssignedDevRegion *r_access = (AssignedDevRegion *)opaque; Cast is unnecessary. +uint32_t r_pio = (unsigned long)r_access-r_virtbase ++ (addr - r_access-e_physbase); It would be nice to make this a function to make it more obvious that you were translated from guest to host regions. The cast to unsigned long should probably be target_ulong too. +DEBUG(stderr, %s: r_pio=%08x e_physbase=%08x + r_virtbase=%08lx value=%08x\n, + __func__, r_pio, (int)r_access-e_physbase, + (unsigned long)r_access-r_virtbase, value); This debug statement looks wrong to me. You're passing stderr. It's true for all of these functions. +static void assigned_dev_iomem_map(PCIDevice *pci_dev, int region_num, + uint32_t e_phys, uint32_t e_size,