Re: [PATCH v4 3/3] Inter-VM shared memory PCI device
On 04/15/2010 02:30 AM, Cam Macdonell wrote: Sample programs, init scripts and the shared memory server are available in a git repo here: www.gitorious.org/nahanni Please consider qemu.git/contrib. Should the compilation be tied into Qemu's regular build with a switch (e.g. --enable-ivshmem-server)? Or should it be its own separate build? It can have its own makefile. --- Makefile.target |3 + hw/ivshmem.c| 700 +++ qemu-char.c |6 + qemu-char.h |3 + qemu-doc.texi | 45 + Seems to be light on qdev devices. I notice there is a section named "Data Type Index" that "could be used for qdev device names and options", but is currently empty. Should I place documentation there of device there or just add it to "3.3 Invocation"? I think those are in qemu-options.hx. Just put it somewhere where it seems appropriate. 4 files changed, 712 insertions(+), 0 deletions(-) create mode 100644 hw/ivshmem.c diff --git a/Makefile.target b/Makefile.target index 1ffd802..bc9a681 100644 --- a/Makefile.target +++ b/Makefile.target @@ -199,6 +199,9 @@ obj-$(CONFIG_USB_OHCI) += usb-ohci.o obj-y += rtl8139.o obj-y += e1000.o +# Inter-VM PCI shared memory +obj-y += ivshmem.o + depends on CONFIG_PCI as in obj-($CONFIG_PCI) += ivshmem.o the variable CONFIG_PCI doesn't seem to be set during configuration. I don't see any other PCI devices that depend on it. My mistake, keep as is. Do we also want to depend on CONFIG_KVM? No real need. +static void create_shared_memory_BAR(IVShmemState *s, int fd) { + +s->shm_fd = fd; + +s->ivshmem_offset = qemu_ram_mmap(s->shm_fd, s->ivshmem_size, + MAP_SHARED, 0); Where did the offset go? 0 is the offset. I include the offset parameter in qemu_ram_mmap() to make it flexible for other uses. Makes sense. Are you suggesting to take an optional offset as an argument to -device? No need. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 3/3] Inter-VM shared memory PCI device
On Mon, Apr 12, 2010 at 2:56 PM, Avi Kivity wrote: > On 04/08/2010 01:52 AM, Cam Macdonell wrote: >> >> Support an inter-vm shared memory device that maps a shared-memory object >> as a >> PCI device in the guest. This patch also supports interrupts between >> guest by >> communicating over a unix domain socket. This patch applies to the >> qemu-kvm >> repository. >> >> -device ivshmem,size=[,shm=] >> > > Can that be (2M, 4G, 19T, ...). > >> Interrupts are supported between multiple VMs by using a shared memory >> server >> by using a chardev socket. >> >> -device ivshmem,size=[,shm=> name>][,chardev=][,msi=on] >> [,irqfd=on][,vectors=n] >> -chardev socket,path=,id= >> > > Do we need the irqfd parameter? Should be on by default. > > On the other hand, it may fail with older kernels with limited irqfd slots, > so better keep it there. > >> Sample programs, init scripts and the shared memory server are available >> in a >> git repo here: >> >> www.gitorious.org/nahanni >> > > Please consider qemu.git/contrib. Should the compilation be tied into Qemu's regular build with a switch (e.g. --enable-ivshmem-server)? Or should it be its own separate build? Cam > >> --- >> Makefile.target | 3 + >> hw/ivshmem.c | 700 >> +++ >> qemu-char.c | 6 + >> qemu-char.h | 3 + >> > > qemu-doc.texi | 45 + Seems to be light on qdev devices. I notice there is a section named "Data Type Index" that "could be used for qdev device names and options", but is currently empty. Should I place documentation there of device there or just add it to "3.3 Invocation"? > >> 4 files changed, 712 insertions(+), 0 deletions(-) >> create mode 100644 hw/ivshmem.c >> >> diff --git a/Makefile.target b/Makefile.target >> index 1ffd802..bc9a681 100644 >> --- a/Makefile.target >> +++ b/Makefile.target >> @@ -199,6 +199,9 @@ obj-$(CONFIG_USB_OHCI) += usb-ohci.o >> obj-y += rtl8139.o >> obj-y += e1000.o >> >> +# Inter-VM PCI shared memory >> +obj-y += ivshmem.o >> + >> > > depends on CONFIG_PCI as in obj-($CONFIG_PCI) += ivshmem.o the variable CONFIG_PCI doesn't seem to be set during configuration. I don't see any other PCI devices that depend on it. Do we also want to depend on CONFIG_KVM? >> +static void create_shared_memory_BAR(IVShmemState *s, int fd) { >> + >> + s->shm_fd = fd; >> + >> + s->ivshmem_offset = qemu_ram_mmap(s->shm_fd, s->ivshmem_size, >> + MAP_SHARED, 0); >> > > Where did the offset go? 0 is the offset. I include the offset parameter in qemu_ram_mmap() to make it flexible for other uses. Are you suggesting to take an optional offset as an argument to -device? Cam -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 3/3] Inter-VM shared memory PCI device
On 04/08/2010 01:52 AM, Cam Macdonell wrote: Support an inter-vm shared memory device that maps a shared-memory object as a PCI device in the guest. This patch also supports interrupts between guest by communicating over a unix domain socket. This patch applies to the qemu-kvm repository. -device ivshmem,size=[,shm=] Can that be (2M, 4G, 19T, ...). Interrupts are supported between multiple VMs by using a shared memory server by using a chardev socket. -device ivshmem,size=[,shm=][,chardev=][,msi=on] [,irqfd=on][,vectors=n] -chardev socket,path=,id= Do we need the irqfd parameter? Should be on by default. On the other hand, it may fail with older kernels with limited irqfd slots, so better keep it there. Sample programs, init scripts and the shared memory server are available in a git repo here: www.gitorious.org/nahanni Please consider qemu.git/contrib. --- Makefile.target |3 + hw/ivshmem.c| 700 +++ qemu-char.c |6 + qemu-char.h |3 + qemu-doc.texi | 45 + 4 files changed, 712 insertions(+), 0 deletions(-) create mode 100644 hw/ivshmem.c diff --git a/Makefile.target b/Makefile.target index 1ffd802..bc9a681 100644 --- a/Makefile.target +++ b/Makefile.target @@ -199,6 +199,9 @@ obj-$(CONFIG_USB_OHCI) += usb-ohci.o obj-y += rtl8139.o obj-y += e1000.o +# Inter-VM PCI shared memory +obj-y += ivshmem.o + depends on CONFIG_PCI # Hardware support + +#define PCI_COMMAND_IOACCESS0x0001 +#define PCI_COMMAND_MEMACCESS 0x0002 Should be in pci.h? + +#define DEBUG_IVSHMEM Disable for production. + +#define IVSHMEM_IRQFD 0 +#define IVSHMEM_MSI 1 +#define IVSHMEM_MAX_EVENTFDS 16 Too low? why limit? + +struct eventfd_entry { +PCIDevice *pdev; +int vector; +}; Coding style: EventfdEntry, and a typedef. + +typedef struct IVShmemState { +PCIDevice dev; +uint32_t intrmask; +uint32_t intrstatus; +uint32_t doorbell; + +CharDriverState * chr; +CharDriverState ** eventfd_chr; +int ivshmem_mmio_io_addr; + +pcibus_t mmio_addr; +uint8_t *ivshmem_ptr; +unsigned long ivshmem_offset; off_t +unsigned int ivshmem_size; ram_addr_t + +/* accessing registers - based on rtl8139 */ +static void ivshmem_update_irq(IVShmemState *s, int val) +{ +int isr; +isr = (s->intrstatus& s->intrmask)& 0x; This is highly undocumented, but fits my suggested 'status is bitmap'. 'isr' needs to be uint32_t if you mask it like that. + +/* don't print ISR resets */ +if (isr) { +IVSHMEM_DPRINTF("Set IRQ to %d (%04x %04x)\n", + isr ? 1 : 0, s->intrstatus, s->intrmask); +} + +qemu_set_irq(s->dev.irq[0], (isr != 0)); +} + + + +static void create_shared_memory_BAR(IVShmemState *s, int fd) { + +s->shm_fd = fd; + +s->ivshmem_offset = qemu_ram_mmap(s->shm_fd, s->ivshmem_size, + MAP_SHARED, 0); Where did the offset go? + +s->ivshmem_ptr = qemu_get_ram_ptr(s->ivshmem_offset); Never used, please drop. + +/* region for shared memory */ +pci_register_bar(&s->dev, 2, s->ivshmem_size, + PCI_BASE_ADDRESS_SPACE_MEMORY, ivshmem_map); Might be worthwhile to mark it as a 64-bit BAR. Please test with ridiculous shared memory sizes. +} + +static int ivshmem_irqfd(PCIDevice* pdev, uint16_t vector, int fd) +{ +struct kvm_irqfd call = { }; +int r; + +IVSHMEM_DPRINTF("inside irqfd\n"); +if (vector>= pdev->msix_entries_nr) +return -EINVAL; +call.fd = fd; +call.gsi = pdev->msix_irq_entries[vector].gsi; +r = kvm_vm_ioctl(kvm_state, KVM_IRQFD,&call); +if (r< 0) { +IVSHMEM_DPRINTF("allocating irqfd failed %d\n", r); +return r; +} +return 0; +} should be in kvm.c for reuse. + +static int ivshmem_ioeventfd(IVShmemState* s, int posn, int fd, int vector) +{ + +int ret; +struct kvm_ioeventfd iofd; + +iofd.datamatch = (posn<< 16) | vector; +iofd.addr = s->mmio_addr + Doorbell; +iofd.len = 4; +iofd.flags = KVM_IOEVENTFD_FLAG_DATAMATCH; +iofd.fd = fd; + +ret = kvm_vm_ioctl(kvm_state, KVM_IOEVENTFD,&iofd); Ditto. + +if (ret< 0) { +fprintf(stderr, "error assigning ioeventfd (%d)\n", ret); +perror(strerror(ret)); +} else { +IVSHMEM_DPRINTF("success assigning ioeventfd (%d:%d)\n", posn, vector); +} + +return ret; +} blank line here. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4 3/3] Inter-VM shared memory PCI device
Support an inter-vm shared memory device that maps a shared-memory object as a PCI device in the guest. This patch also supports interrupts between guest by communicating over a unix domain socket. This patch applies to the qemu-kvm repository. -device ivshmem,size=[,shm=] Interrupts are supported between multiple VMs by using a shared memory server by using a chardev socket. -device ivshmem,size=[,shm=][,chardev=][,msi=on] [,irqfd=on][,vectors=n] -chardev socket,path=,id= Sample programs, init scripts and the shared memory server are available in a git repo here: www.gitorious.org/nahanni --- Makefile.target |3 + hw/ivshmem.c| 700 +++ qemu-char.c |6 + qemu-char.h |3 + 4 files changed, 712 insertions(+), 0 deletions(-) create mode 100644 hw/ivshmem.c diff --git a/Makefile.target b/Makefile.target index 1ffd802..bc9a681 100644 --- a/Makefile.target +++ b/Makefile.target @@ -199,6 +199,9 @@ obj-$(CONFIG_USB_OHCI) += usb-ohci.o obj-y += rtl8139.o obj-y += e1000.o +# Inter-VM PCI shared memory +obj-y += ivshmem.o + # Hardware support obj-i386-y = pckbd.o dma.o obj-i386-y += vga.o diff --git a/hw/ivshmem.c b/hw/ivshmem.c new file mode 100644 index 000..2ec6c2c --- /dev/null +++ b/hw/ivshmem.c @@ -0,0 +1,700 @@ +/* + * Inter-VM Shared Memory PCI device. + * + * Author: + * Cam Macdonell + * + * Based On: cirrus_vga.c and rtl8139.c + * + * This code is licensed under the GNU GPL v2. + */ +#include +#include +#include +#include +#include +#include +#include "hw.h" +#include "console.h" +#include "pc.h" +#include "pci.h" +#include "sysemu.h" + +#include "msix.h" +#include "qemu-kvm.h" +#include "libkvm.h" + +#include +#include +#include +#include + +#define PCI_COMMAND_IOACCESS0x0001 +#define PCI_COMMAND_MEMACCESS 0x0002 + +#define DEBUG_IVSHMEM + +#define IVSHMEM_IRQFD 0 +#define IVSHMEM_MSI 1 +#define IVSHMEM_MAX_EVENTFDS 16 + +#ifdef DEBUG_IVSHMEM +#define IVSHMEM_DPRINTF(fmt, args...)\ +do {printf("IVSHMEM: " fmt, ##args); } while (0) +#else +#define IVSHMEM_DPRINTF(fmt, args...) +#endif + +#define NEW_GUEST_VAL UINT_MAX + +struct eventfd_entry { +PCIDevice *pdev; +int vector; +}; + +typedef struct IVShmemState { +PCIDevice dev; +uint32_t intrmask; +uint32_t intrstatus; +uint32_t doorbell; + +CharDriverState * chr; +CharDriverState ** eventfd_chr; +int ivshmem_mmio_io_addr; + +pcibus_t mmio_addr; +uint8_t *ivshmem_ptr; +unsigned long ivshmem_offset; +unsigned int ivshmem_size; +int shm_fd; /* shared memory file descriptor */ + +/* array of eventfds for each guest */ +int * eventfds[IVSHMEM_MAX_EVENTFDS]; +/* keep track of # of eventfds for each guest*/ +int * eventfds_posn_count; + +int vm_id; +int num_eventfds; +uint32_t vectors; +uint32_t features; +struct eventfd_entry eventfd_table[IVSHMEM_MAX_EVENTFDS]; + +char * shmobj; +uint32_t size; /*size of shared memory in MB*/ +} IVShmemState; + +/* registers for the Inter-VM shared memory device */ +enum ivshmem_registers { +IntrMask = 0, +IntrStatus = 4, +IVPosition = 8, +Doorbell = 12, +}; + +static inline uint32_t ivshmem_has_feature(IVShmemState *ivs, int feature) { +return (ivs->features & (1 << feature)); +} + +static inline int is_power_of_two(int x) { +return (x & (x-1)) == 0; +} + +static void ivshmem_map(PCIDevice *pci_dev, int region_num, +pcibus_t addr, pcibus_t size, int type) +{ +IVShmemState *s = DO_UPCAST(IVShmemState, dev, pci_dev); + +IVSHMEM_DPRINTF("addr = %u size = %u\n", (uint32_t)addr, (uint32_t)size); +cpu_register_physical_memory(addr, s->ivshmem_size, s->ivshmem_offset); + +} + +/* accessing registers - based on rtl8139 */ +static void ivshmem_update_irq(IVShmemState *s, int val) +{ +int isr; +isr = (s->intrstatus & s->intrmask) & 0x; + +/* don't print ISR resets */ +if (isr) { +IVSHMEM_DPRINTF("Set IRQ to %d (%04x %04x)\n", + isr ? 1 : 0, s->intrstatus, s->intrmask); +} + +qemu_set_irq(s->dev.irq[0], (isr != 0)); +} + +static void ivshmem_IntrMask_write(IVShmemState *s, uint32_t val) +{ +IVSHMEM_DPRINTF("IntrMask write(w) val = 0x%04x\n", val); + +s->intrmask = val; + +ivshmem_update_irq(s, val); +} + +static uint32_t ivshmem_IntrMask_read(IVShmemState *s) +{ +uint32_t ret = s->intrmask; + +IVSHMEM_DPRINTF("intrmask read(w) val = 0x%04x\n", ret); + +return ret; +} + +static void ivshmem_IntrStatus_write(IVShmemState *s, uint32_t val) +{ +IVSHMEM_DPRINTF("IntrStatus write(w) val = 0x%04x\n", val); + +s->intrstatus = val; + +ivshmem_update_irq(s, val); +return; +} + +static uint32_t ivshmem_IntrStatus_read(IVShmemState *s) +{ +uint32_t ret = s->intrstatus; + +/* reading ISR clears all interrupts */