Hi

On Mon, Feb 29, 2016 at 7:40 PM, Markus Armbruster <arm...@redhat.com> wrote:
> ivshmem can be configured with and without interrupt capability
> (a.k.a. "doorbell").  The two configurations have largely disjoint
> options, which makes for a confusing (and badly checked) user
> interface.  Moreover, the device can't tell the guest whether its
> doorbell is enabled.
>
> Create two new device models ivshmem-plain and ivshmem-doorbell, and
> deprecate the old one.
>
> Changes from ivshmem:
>
> * PCI revision is 1 instead of 0.  The new revision is fully backwards
>   compatible for guests.  Guests may elect to require at least
>   revision 1 to make sure they're not exposed to the funny "no shared
>   memory, yet" state.
>
> * Property "role" replaced by "master".  role=master becomes
>   master=on, role=peer becomes master=off.  Default is off instead of
>   auto.
>
> * Property "use64" is gone.  The new devices always have 64 bit BARs.
>
> Changes from ivshmem to ivshmem-plain:
>
> * The Interrupt Pin register in PCI config space is zero (does not use
>   an interrupt pin) instead of one (uses INTA).
>
> * Property "x-memdev" is renamed to "memdev".
>
> * Properties "shm" and "size" are gone.  Use property "memdev"
>   instead.
>
> * Property "msi" is gone.  The new device can't have MSI-X capability.
>   It can't interrupt anyway.
>
> * Properties "ioeventfd" and "vectors" are gone.  They're meaningless
>   without interrupts anyway.
>
> Changes from ivshmem to ivshmem-doorbell:
>
> * Property "msi" is gone.  The new device always has MSI-X capability.
>
> * Property "ioeventfd" defaults to on instead of off.
>
> * Property "size" is gone.  The new device can only map all the shared
>   memory received from the server.
>
> Guests can easily find out whether the device is configured for
> interrupts by checking for MSI-X capability.
>
> Note: some code added in sub-optimal places to make the diff easier to
> review.  The next commit will move it to more sensible places.
>
> Signed-off-by: Markus Armbruster <arm...@redhat.com>
> ---

This is a large patch that could perhaps be splitted, but I didn't see
any issue with it so:
Reviewed-by: Marc-André Lureau <marcandre.lur...@redhat.com>


>  docs/specs/ivshmem-spec.txt |  66 +++++-----
>  hw/misc/ivshmem.c           | 310 
> ++++++++++++++++++++++++++++++++------------
>  qemu-doc.texi               |  33 ++---
>  tests/ivshmem-test.c        |  12 +-
>  4 files changed, 288 insertions(+), 133 deletions(-)
>
> diff --git a/docs/specs/ivshmem-spec.txt b/docs/specs/ivshmem-spec.txt
> index 3eb8c97..047dfc0 100644
> --- a/docs/specs/ivshmem-spec.txt
> +++ b/docs/specs/ivshmem-spec.txt
> @@ -17,9 +17,10 @@ get interrupted by its peers.
>
>  There are two basic configurations:
>
> -- Just shared memory: -device ivshmem,shm=NAME,...
> +- Just shared memory: -device ivshmem-plain,memdev=HMB,...
>
> -  This uses shared memory object NAME.
> +  This uses host memory backend HMB.  It should have option "share"
> +  set.
>
>  - Shared memory plus interrupts: -device ivshmem,chardev=CHR,vectors=N,...
>
> @@ -30,9 +31,8 @@ There are two basic configurations:
>    Each peer gets assigned a unique ID by the server.  IDs must be
>    between 0 and 65535.
>
> -  Interrupts are message-signaled by default (MSI-X).  With msi=off
> -  the device has no MSI-X capability, and uses legacy INTx instead.
> -  vectors=N configures the number of vectors to use.
> +  Interrupts are message-signaled (MSI-X).  vectors=N configures the
> +  number of vectors to use.
>
>  For more details on ivshmem device properties, see The QEMU Emulator
>  User Documentation (qemu-doc.*).
> @@ -40,14 +40,15 @@ User Documentation (qemu-doc.*).
>
>  == The ivshmem PCI device's guest interface ==
>
> -The device has vendor ID 1af4, device ID 1110, revision 0.
> +The device has vendor ID 1af4, device ID 1110, revision 1.  Before
> +QEMU 2.6.0, it had revision 0.
>
>  === PCI BARs ===
>
>  The ivshmem PCI device has two or three BARs:
>
>  - BAR0 holds device registers (256 Byte MMIO)
> -- BAR1 holds MSI-X table and PBA (only when using MSI-X)
> +- BAR1 holds MSI-X table and PBA (only ivshmem-doorbell)
>  - BAR2 maps the shared memory object
>
>  There are two ways to use this device:
> @@ -58,18 +59,19 @@ There are two ways to use this device:
>    user space (see http://dpdk.org/browse/memnic).
>
>  - If you additionally need the capability for peers to interrupt each
> -  other, you need BAR0 and, if using MSI-X, BAR1.  You will most
> -  likely want to write a kernel driver to handle interrupts.  Requires
> -  the device to be configured for interrupts, obviously.
> +  other, you need BAR0 and BAR1.  You will most likely want to write a
> +  kernel driver to handle interrupts.  Requires the device to be
> +  configured for interrupts, obviously.
>
>  Before QEMU 2.6.0, BAR2 can initially be invalid if the device is
>  configured for interrupts.  It becomes safely accessible only after
> -the ivshmem server provided the shared memory.  Guest software should
> -wait for the IVPosition register (described below) to become
> -non-negative before accessing BAR2.
> +the ivshmem server provided the shared memory.  These devices have PCI
> +revision 0 rather than 1.  Guest software should wait for the
> +IVPosition register (described below) to become non-negative before
> +accessing BAR2.
>
> -The device is not capable to tell guest software whether it is
> -configured for interrupts.
> +Revision 0 of the device is not capable to tell guest software whether
> +it is configured for interrupts.
>
>  === PCI device registers ===
>
> @@ -77,10 +79,12 @@ BAR 0 contains the following registers:
>
>      Offset  Size  Access      On reset  Function
>          0     4   read/write        0   Interrupt Mask
> -                                        bit 0: peer interrupt
> +                                        bit 0: peer interrupt (rev 0)
> +                                               reserved       (rev 1)
>                                          bit 1..31: reserved
>          4     4   read/write        0   Interrupt Status
> -                                        bit 0: peer interrupt
> +                                        bit 0: peer interrupt (rev 0)
> +                                               reserved       (rev 1)
>                                          bit 1..31: reserved
>          8     4   read-only   0 or ID   IVPosition
>         12     4   write-only      N/A   Doorbell
> @@ -92,18 +96,18 @@ Software should only access the registers as specified in 
> column
>  "Access".  Reserved bits should be ignored on read, and preserved on
>  write.
>
> -Interrupt Status and Mask Register together control the legacy INTx
> -interrupt when the device has no MSI-X capability: INTx is asserted
> -when the bit-wise AND of Status and Mask is non-zero and the device
> -has no MSI-X capability.  Interrupt Status Register bit 0 becomes 1
> -when an interrupt request from a peer is received.  Reading the
> -register clears it.
> +In revision 0 of the device, Interrupt Status and Mask Register
> +together control the legacy INTx interrupt when the device has no
> +MSI-X capability: INTx is asserted when the bit-wise AND of Status and
> +Mask is non-zero and the device has no MSI-X capability.  Interrupt
> +Status Register bit 0 becomes 1 when an interrupt request from a peer
> +is received.  Reading the register clears it.
>
>  IVPosition Register: if the device is not configured for interrupts,
>  this is zero.  Else, it is the device's ID (between 0 and 65535).
>
>  Before QEMU 2.6.0, the register may read -1 for a short while after
> -reset.
> +reset.  These devices have PCI revision 0 rather than 1.
>
>  There is no good way for software to find out whether the device is
>  configured for interrupts.  A positive IVPosition means interrupts,
> @@ -124,14 +128,14 @@ interrupt vectors connected, the write is ignored.  The 
> device is not
>  capable to tell guest software what peers are connected, or how many
>  interrupt vectors are connected.
>
> -If the peer doesn't use MSI-X, its Interrupt Status register is set to
> -1.  This asserts INTx unless masked by the Interrupt Mask register.
> -The device is not capable to communicate the interrupt vector to guest
> -software then.
> +The peer's interrupt for this vector then becomes pending.  There is
> +no way for software to clear the pending bit, and a polling mode of
> +operation is therefore impossible.
>
> -If the peer uses MSI-X, the interrupt for this vector becomes pending.
> -There is no way for software to clear the pending bit, and a polling
> -mode of operation is therefore impossible with MSI-X.
> +If the peer is a revision 0 device without MSI-X capability, its
> +Interrupt Status register is set to 1.  This asserts INTx unless
> +masked by the Interrupt Mask register.  The device is not capable to
> +communicate the interrupt vector to guest software then.
>
>  With multiple MSI-X vectors, different vectors can be used to indicate
>  different events have occurred.  The semantics of interrupt vectors
> diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
> index b39ea27..f7f5b3b 100644
> --- a/hw/misc/ivshmem.c
> +++ b/hw/misc/ivshmem.c
> @@ -29,6 +29,7 @@
>  #include "qom/object_interfaces.h"
>  #include "sysemu/char.h"
>  #include "sysemu/hostmem.h"
> +#include "sysemu/qtest.h"
>  #include "qapi/visitor.h"
>  #include "exec/ram_addr.h"
>
> @@ -53,6 +54,18 @@
>          }                                               \
>      } while (0)
>
> +#define TYPE_IVSHMEM_COMMON "ivshmem-common"
> +#define IVSHMEM_COMMON(obj) \
> +    OBJECT_CHECK(IVShmemState, (obj), TYPE_IVSHMEM_COMMON)
> +
> +#define TYPE_IVSHMEM_PLAIN "ivshmem-plain"
> +#define IVSHMEM_PLAIN(obj) \
> +    OBJECT_CHECK(IVShmemState, (obj), TYPE_IVSHMEM_PLAIN)
> +
> +#define TYPE_IVSHMEM_DOORBELL "ivshmem-doorbell"
> +#define IVSHMEM_DOORBELL(obj) \
> +    OBJECT_CHECK(IVShmemState, (obj), TYPE_IVSHMEM_DOORBELL)
> +
>  #define TYPE_IVSHMEM "ivshmem"
>  #define IVSHMEM(obj) \
>      OBJECT_CHECK(IVShmemState, (obj), TYPE_IVSHMEM)
> @@ -80,8 +93,6 @@ typedef struct IVShmemState {
>      MemoryRegion ivshmem_mmio;
>
>      MemoryRegion *ivshmem_bar2; /* BAR 2 (shared memory) */
> -    size_t ivshmem_size; /* size of shared memory region */
> -    uint32_t ivshmem_64bit;
>
>      Peer *peers;
>      int nb_peers;               /* space in @peers[] */
> @@ -96,9 +107,12 @@ typedef struct IVShmemState {
>      OnOffAuto master;
>      Error *migration_blocker;
>
> -    char * shmobj;
> -    char * sizearg;
> -    char * role;
> +    /* legacy cruft */
> +    char *role;
> +    char *shmobj;
> +    char *sizearg;
> +    size_t legacy_size;
> +    uint32_t not_legacy_32bit;
>  } IVShmemState;
>
>  /* registers for the Inter-VM shared memory device */
> @@ -258,7 +272,7 @@ static void ivshmem_vector_notify(void *opaque)
>  {
>      MSIVector *entry = opaque;
>      PCIDevice *pdev = entry->pdev;
> -    IVShmemState *s = IVSHMEM(pdev);
> +    IVShmemState *s = IVSHMEM_COMMON(pdev);
>      int vector = entry - s->msi_vectors;
>      EventNotifier *n = &s->peers[s->vm_id].eventfds[vector];
>
> @@ -279,7 +293,7 @@ static void ivshmem_vector_notify(void *opaque)
>  static int ivshmem_vector_unmask(PCIDevice *dev, unsigned vector,
>                                   MSIMessage msg)
>  {
> -    IVShmemState *s = IVSHMEM(dev);
> +    IVShmemState *s = IVSHMEM_COMMON(dev);
>      EventNotifier *n = &s->peers[s->vm_id].eventfds[vector];
>      MSIVector *v = &s->msi_vectors[vector];
>      int ret;
> @@ -296,7 +310,7 @@ static int ivshmem_vector_unmask(PCIDevice *dev, unsigned 
> vector,
>
>  static void ivshmem_vector_mask(PCIDevice *dev, unsigned vector)
>  {
> -    IVShmemState *s = IVSHMEM(dev);
> +    IVShmemState *s = IVSHMEM_COMMON(dev);
>      EventNotifier *n = &s->peers[s->vm_id].eventfds[vector];
>      int ret;
>
> @@ -313,7 +327,7 @@ static void ivshmem_vector_poll(PCIDevice *dev,
>                                  unsigned int vector_start,
>                                  unsigned int vector_end)
>  {
> -    IVShmemState *s = IVSHMEM(dev);
> +    IVShmemState *s = IVSHMEM_COMMON(dev);
>      unsigned int vector;
>
>      IVSHMEM_DPRINTF("vector poll %p %d-%d\n", dev, vector_start, vector_end);
> @@ -460,6 +474,7 @@ static void setup_interrupt(IVShmemState *s, int vector, 
> Error **errp)
>  static void process_msg_shmem(IVShmemState *s, int fd, Error **errp)
>  {
>      struct stat buf;
> +    size_t size;
>      void *ptr;
>
>      if (s->ivshmem_bar2) {
> @@ -475,15 +490,21 @@ static void process_msg_shmem(IVShmemState *s, int fd, 
> Error **errp)
>          return;
>      }
>
> -    if (s->ivshmem_size > buf.st_size) {
> -        error_setg(errp, "server sent only %zd bytes of shared memory",
> -                   (size_t)buf.st_size);
> -        close(fd);
> -        return;
> +    size = buf.st_size;
> +
> +    /* Legacy cruft */
> +    if (s->legacy_size != SIZE_MAX) {
> +        if (size < s->legacy_size) {
> +            error_setg(errp, "server sent only %zd bytes of shared memory",
> +                       (size_t)buf.st_size);
> +            close(fd);
> +            return;
> +        }
> +        size = s->legacy_size;
>      }
>
>      /* mmap the region and map into the BAR2 */
> -    ptr = mmap(0, s->ivshmem_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 
> 0);
> +    ptr = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
>      if (ptr == MAP_FAILED) {
>          error_setg_errno(errp, errno, "Failed to mmap shared memory");
>          close(fd);
> @@ -491,7 +512,7 @@ static void process_msg_shmem(IVShmemState *s, int fd, 
> Error **errp)
>      }
>      s->ivshmem_bar2 = g_new(MemoryRegion, 1);
>      memory_region_init_ram_ptr(s->ivshmem_bar2, OBJECT(s),
> -                               "ivshmem.bar2", s->ivshmem_size, ptr);
> +                               "ivshmem.bar2", size, ptr);
>      qemu_set_ram_fd(s->ivshmem_bar2->ram_addr, fd);
>  }
>
> @@ -700,7 +721,7 @@ static void ivshmem_vector_use(IVShmemState *s)
>
>  static void ivshmem_reset(DeviceState *d)
>  {
> -    IVShmemState *s = IVSHMEM(d);
> +    IVShmemState *s = IVSHMEM_COMMON(d);
>
>      s->intrstatus = 0;
>      s->intrmask = 0;
> @@ -776,7 +797,7 @@ static void ivshmem_disable_irqfd(IVShmemState *s)
>  static void ivshmem_write_config(PCIDevice *pdev, uint32_t address,
>                                   uint32_t val, int len)
>  {
> -    IVShmemState *s = IVSHMEM(pdev);
> +    IVShmemState *s = IVSHMEM_COMMON(pdev);
>      int is_enabled, was_enabled = msix_enabled(pdev);
>
>      pci_default_write_config(pdev, address, val, len);
> @@ -818,42 +839,14 @@ static HostMemoryBackend *desugar_shm(const char *shm, 
> size_t size)
>      return MEMORY_BACKEND(obj);
>  }
>
> -static void pci_ivshmem_realize(PCIDevice *dev, Error **errp)
> +static void ivshmem_common_realize(PCIDevice *dev, Error **errp)
>  {
> -    IVShmemState *s = IVSHMEM(dev);
> +    IVShmemState *s = IVSHMEM_COMMON(dev);
>      Error *err = NULL;
>      uint8_t *pci_conf;
>      uint8_t attr = PCI_BASE_ADDRESS_SPACE_MEMORY |
>          PCI_BASE_ADDRESS_MEM_PREFETCH;
>
> -    if (!!s->server_chr + !!s->shmobj + !!s->hostmem != 1) {
> -        error_setg(errp,
> -                   "You must specify either 'shm', 'chardev' or 'x-memdev'");
> -        return;
> -    }
> -
> -    if (s->hostmem) {
> -        MemoryRegion *mr;
> -
> -        if (s->sizearg) {
> -            g_warning("size argument ignored with hostmem");
> -        }
> -
> -        mr = host_memory_backend_get_memory(s->hostmem, &error_abort);
> -        s->ivshmem_size = memory_region_size(mr);
> -    } else if (s->sizearg == NULL) {
> -        s->ivshmem_size = 4 << 20; /* 4 MB default */
> -    } else {
> -        char *end;
> -        int64_t size = qemu_strtosz(s->sizearg, &end);
> -        if (size < 0 || (size_t)size != size || *end != '\0'
> -            || !is_power_of_2(size)) {
> -            error_setg(errp, "Invalid size %s", s->sizearg);
> -            return;
> -        }
> -        s->ivshmem_size = size;
> -    }
> -
>      /* IRQFD requires MSI */
>      if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD) &&
>          !ivshmem_has_feature(s, IVSHMEM_MSI)) {
> @@ -861,29 +854,9 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
> **errp)
>          return;
>      }
>
> -    /* check that role is reasonable */
> -    if (s->role) {
> -        if (strncmp(s->role, "peer", 5) == 0) {
> -            s->master = ON_OFF_AUTO_OFF;
> -        } else if (strncmp(s->role, "master", 7) == 0) {
> -            s->master = ON_OFF_AUTO_ON;
> -        } else {
> -            error_setg(errp, "'role' must be 'peer' or 'master'");
> -            return;
> -        }
> -    } else {
> -        s->master = ON_OFF_AUTO_AUTO;
> -    }
> -
>      pci_conf = dev->config;
>      pci_conf[PCI_COMMAND] = PCI_COMMAND_IO | PCI_COMMAND_MEMORY;
>
> -    /*
> -     * Note: we don't use INTx with IVSHMEM_MSI at all, so this is a
> -     * bald-faced lie then.  But it's a backwards compatible lie.
> -     */
> -    pci_config_set_interrupt_pin(pci_conf, 1);
> -
>      memory_region_init_io(&s->ivshmem_mmio, OBJECT(s), &ivshmem_mmio_ops, s,
>                            "ivshmem-mmio", IVSHMEM_REG_BAR_SIZE);
>
> @@ -891,14 +864,10 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
> **errp)
>      pci_register_bar(dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY,
>                       &s->ivshmem_mmio);
>
> -    if (s->ivshmem_64bit) {
> +    if (!s->not_legacy_32bit) {
>          attr |= PCI_BASE_ADDRESS_MEM_TYPE_64;
>      }
>
> -    if (s->shmobj) {
> -        s->hostmem = desugar_shm(s->shmobj, s->ivshmem_size);
> -    }
> -
>      if (s->hostmem != NULL) {
>          IVSHMEM_DPRINTF("using hostmem\n");
>
> @@ -945,9 +914,68 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
> **errp)
>      }
>  }
>
> -static void pci_ivshmem_exit(PCIDevice *dev)
> +static void ivshmem_realize(PCIDevice *dev, Error **errp)
>  {
> -    IVShmemState *s = IVSHMEM(dev);
> +    IVShmemState *s = IVSHMEM_COMMON(dev);
> +
> +    if (!qtest_enabled()) {
> +        error_report("ivshmem is deprecated, please use ivshmem-plain"
> +                     " or ivshmem-doorbell instead");
> +    }
> +
> +    if (!!s->server_chr + !!s->shmobj + !!s->hostmem != 1) {
> +        error_setg(errp,
> +                   "You must specify either 'shm', 'chardev' or 'x-memdev'");
> +        return;
> +    }
> +
> +    if (s->hostmem) {
> +        if (s->sizearg) {
> +            g_warning("size argument ignored with hostmem");
> +        }
> +    } else if (s->sizearg == NULL) {
> +        s->legacy_size = 4 << 20; /* 4 MB default */
> +    } else {
> +        char *end;
> +        int64_t size = qemu_strtosz(s->sizearg, &end);
> +        if (size < 0 || (size_t)size != size || *end != '\0'
> +            || !is_power_of_2(size)) {
> +            error_setg(errp, "Invalid size %s", s->sizearg);
> +            return;
> +        }
> +        s->legacy_size = size;
> +    }
> +
> +    /* check that role is reasonable */
> +    if (s->role) {
> +        if (strncmp(s->role, "peer", 5) == 0) {
> +            s->master = ON_OFF_AUTO_OFF;
> +        } else if (strncmp(s->role, "master", 7) == 0) {
> +            s->master = ON_OFF_AUTO_ON;
> +        } else {
> +            error_setg(errp, "'role' must be 'peer' or 'master'");
> +            return;
> +        }
> +    } else {
> +        s->master = ON_OFF_AUTO_AUTO;
> +    }
> +
> +    if (s->shmobj) {
> +        s->hostmem = desugar_shm(s->shmobj, s->legacy_size);
> +    }
> +
> +    /*
> +     * Note: we don't use INTx with IVSHMEM_MSI at all, so this is a
> +     * bald-faced lie then.  But it's a backwards compatible lie.
> +     */
> +    pci_config_set_interrupt_pin(dev->config, 1);
> +
> +    ivshmem_common_realize(dev, errp);
> +}
> +
> +static void ivshmem_exit(PCIDevice *dev)
> +{
> +    IVShmemState *s = IVSHMEM_COMMON(dev);
>      int i;
>
>      if (s->migration_blocker) {
> @@ -959,7 +987,7 @@ static void pci_ivshmem_exit(PCIDevice *dev)
>          if (!s->hostmem) {
>              void *addr = memory_region_get_ram_ptr(s->ivshmem_bar2);
>
> -            if (munmap(addr, s->ivshmem_size) == -1) {
> +            if (munmap(addr, memory_region_size(s->ivshmem_bar2) == -1)) {
>                  error_report("Failed to munmap shared memory %s",
>                               strerror(errno));
>              }
> @@ -1074,28 +1102,39 @@ static Property ivshmem_properties[] = {
>      DEFINE_PROP_BIT("msi", IVShmemState, features, IVSHMEM_MSI, true),
>      DEFINE_PROP_STRING("shm", IVShmemState, shmobj),
>      DEFINE_PROP_STRING("role", IVShmemState, role),
> -    DEFINE_PROP_UINT32("use64", IVShmemState, ivshmem_64bit, 1),
> +    DEFINE_PROP_UINT32("use64", IVShmemState, not_legacy_32bit, 1),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>
> -static void ivshmem_class_init(ObjectClass *klass, void *data)
> +static void ivshmem_common_class_init(ObjectClass *klass, void *data)
>  {
>      DeviceClass *dc = DEVICE_CLASS(klass);
>      PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>
> -    k->realize = pci_ivshmem_realize;
> -    k->exit = pci_ivshmem_exit;
> +    k->realize = ivshmem_common_realize;
> +    k->exit = ivshmem_exit;
>      k->config_write = ivshmem_write_config;
>      k->vendor_id = PCI_VENDOR_ID_IVSHMEM;
>      k->device_id = PCI_DEVICE_ID_IVSHMEM;
>      k->class_id = PCI_CLASS_MEMORY_RAM;
> +    k->revision = 1;
>      dc->reset = ivshmem_reset;
> -    dc->props = ivshmem_properties;
> -    dc->vmsd = &ivshmem_vmsd;
>      set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>      dc->desc = "Inter-VM shared memory";
>  }
>
> +static void ivshmem_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> +
> +    k->realize = ivshmem_realize;
> +    k->revision = 0;
> +    dc->desc = "Inter-VM shared memory (legacy)";
> +    dc->props = ivshmem_properties;
> +    dc->vmsd = &ivshmem_vmsd;
> +}
> +
>  static void ivshmem_check_memdev_is_busy(Object *obj, const char *name,
>                                           Object *val, Error **errp)
>  {
> @@ -1122,16 +1161,121 @@ static void ivshmem_init(Object *obj)
>                               &error_abort);
>  }
>
> +static const TypeInfo ivshmem_common_info = {
> +    .name          = TYPE_IVSHMEM_COMMON,
> +    .parent        = TYPE_PCI_DEVICE,
> +    .instance_size = sizeof(IVShmemState),
> +    .abstract      = true,
> +    .class_init    = ivshmem_common_class_init,
> +};
> +
>  static const TypeInfo ivshmem_info = {
>      .name          = TYPE_IVSHMEM,
> -    .parent        = TYPE_PCI_DEVICE,
> +    .parent        = TYPE_IVSHMEM_COMMON,
>      .instance_size = sizeof(IVShmemState),
>      .instance_init = ivshmem_init,
>      .class_init    = ivshmem_class_init,
>  };
>
> +static const VMStateDescription ivshmem_plain_vmsd = {
> +    .name = TYPE_IVSHMEM_PLAIN,
> +    .version_id = 0,
> +    .minimum_version_id = 0,
> +    .pre_load = ivshmem_pre_load,
> +    .post_load = ivshmem_post_load,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_PCI_DEVICE(parent_obj, IVShmemState),
> +        VMSTATE_UINT32(intrstatus, IVShmemState),
> +        VMSTATE_UINT32(intrmask, IVShmemState),
> +        VMSTATE_END_OF_LIST()
> +    },
> +};
> +
> +static Property ivshmem_plain_properties[] = {
> +    DEFINE_PROP_ON_OFF_AUTO("master", IVShmemState, master, ON_OFF_AUTO_OFF),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void ivshmem_plain_init(Object *obj)
> +{
> +    IVShmemState *s = IVSHMEM_PLAIN(obj);
> +
> +    object_property_add_link(obj, "memdev", TYPE_MEMORY_BACKEND,
> +                             (Object **)&s->hostmem,
> +                             ivshmem_check_memdev_is_busy,
> +                             OBJ_PROP_LINK_UNREF_ON_RELEASE,
> +                             &error_abort);
> +}
> +
> +static void ivshmem_plain_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +
> +    dc->props = ivshmem_plain_properties;
> +    dc->vmsd = &ivshmem_plain_vmsd;
> +}
> +
> +static const TypeInfo ivshmem_plain_info = {
> +    .name          = TYPE_IVSHMEM_PLAIN,
> +    .parent        = TYPE_IVSHMEM_COMMON,
> +    .instance_size = sizeof(IVShmemState),
> +    .instance_init = ivshmem_plain_init,
> +    .class_init    = ivshmem_plain_class_init,
> +};
> +
> +static const VMStateDescription ivshmem_doorbell_vmsd = {
> +    .name = TYPE_IVSHMEM_DOORBELL,
> +    .version_id = 0,
> +    .minimum_version_id = 0,
> +    .pre_load = ivshmem_pre_load,
> +    .post_load = ivshmem_post_load,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_PCI_DEVICE(parent_obj, IVShmemState),
> +        VMSTATE_MSIX(parent_obj, IVShmemState),
> +        VMSTATE_UINT32(intrstatus, IVShmemState),
> +        VMSTATE_UINT32(intrmask, IVShmemState),
> +        VMSTATE_END_OF_LIST()
> +    },
> +};
> +
> +static Property ivshmem_doorbell_properties[] = {
> +    DEFINE_PROP_CHR("chardev", IVShmemState, server_chr),
> +    DEFINE_PROP_UINT32("vectors", IVShmemState, vectors, 1),
> +    DEFINE_PROP_BIT("ioeventfd", IVShmemState, features, IVSHMEM_IOEVENTFD,
> +                    true),
> +    DEFINE_PROP_ON_OFF_AUTO("master", IVShmemState, master, ON_OFF_AUTO_OFF),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void ivshmem_doorbell_init(Object *obj)
> +{
> +    IVShmemState *s = IVSHMEM_DOORBELL(obj);
> +
> +    s->features |= (1 << IVSHMEM_MSI);
> +    s->legacy_size = SIZE_MAX;  /* whatever the server sends */
> +}
> +
> +static void ivshmem_doorbell_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +
> +    dc->props = ivshmem_doorbell_properties;
> +    dc->vmsd = &ivshmem_doorbell_vmsd;
> +}
> +
> +static const TypeInfo ivshmem_doorbell_info = {
> +    .name          = TYPE_IVSHMEM_DOORBELL,
> +    .parent        = TYPE_IVSHMEM_COMMON,
> +    .instance_size = sizeof(IVShmemState),
> +    .instance_init = ivshmem_doorbell_init,
> +    .class_init    = ivshmem_doorbell_class_init,
> +};
> +
>  static void ivshmem_register_types(void)
>  {
> +    type_register_static(&ivshmem_common_info);
> +    type_register_static(&ivshmem_plain_info);
> +    type_register_static(&ivshmem_doorbell_info);
>      type_register_static(&ivshmem_info);
>  }
>
> diff --git a/qemu-doc.texi b/qemu-doc.texi
> index 8afbbcd..0dd01c7 100644
> --- a/qemu-doc.texi
> +++ b/qemu-doc.texi
> @@ -1262,13 +1262,18 @@ basic example.
>
>  @subsection Inter-VM Shared Memory device
>
> -With KVM enabled on a Linux host, a shared memory device is available.  
> Guests
> -map a POSIX shared memory region into the guest as a PCI device that enables
> -zero-copy communication to the application level of the guests.  The basic
> -syntax is:
> +On Linux hosts, a shared memory device is available.  The basic syntax
> +is:
>
>  @example
> -qemu-system-i386 -device ivshmem,size=@var{size},shm=@var{shm-name}
> +qemu-system-x86_64 -device ivshmem-plain,memdev=@var{hostmem}
> +@end example
> +
> +where @var{hostmem} names a host memory backend.  For a POSIX shared
> +memory backend, use something like
> +
> +@example
> +-object 
> memory-backend-file,size=1M,share,mem-path=/dev/shm/ivshmem,id=@var{hostmem}
>  @end example
>
>  If desired, interrupts can be sent between guest VMs accessing the same 
> shared
> @@ -1282,8 +1287,7 @@ memory server is:
>  ivshmem-server -p @var{pidfile} -S @var{path} -m @var{shm-name} -l 
> @var{shm-size} -n @var{vectors}
>
>  # Then start your qemu instances with matching arguments
> -qemu-system-i386 -device 
> ivshmem,size=@var{shm-size},vectors=@var{vectors},chardev=@var{id}
> -                 [,msi=on][,ioeventfd=on][,role=peer|master]
> +qemu-system-x86_64 -device 
> ivshmem-doorbell,vectors=@var{vectors},chardev=@var{id}
>                   -chardev socket,path=@var{path},id=@var{id}
>  @end example
>
> @@ -1291,12 +1295,11 @@ When using the server, the guest will be assigned a 
> VM ID (>=0) that allows gues
>  using the same server to communicate via interrupts.  Guests can read their
>  VM ID from a device register (see ivshmem-spec.txt).
>
> -The @option{role} argument can be set to either master or peer and will 
> affect
> -how the shared memory is migrated.  With @option{role=master}, the guest will
> -copy the shared memory on migration to the destination host.  With
> -@option{role=peer}, the guest will not be able to migrate with the device 
> attached.
> -With the @option{peer} case, the device should be detached and then 
> reattached
> -after migration using the PCI hotplug support.
> +With device property @option{master=on}, the guest will copy the shared
> +memory on migration to the destination host.  With @option{master=off},
> +the guest will not be able to migrate with the device attached.  In the
> +latter case, the device should be detached and then reattached after
> +migration using the PCI hotplug support.
>
>  @subsubsection ivshmem and hugepages
>
> @@ -1304,8 +1307,8 @@ Instead of specifying the <shm size> using POSIX shm, 
> you may specify
>  a memory backend that has hugepage support:
>
>  @example
> -qemu-system-i386 -object 
> memory-backend-file,size=1G,mem-path=/dev/hugepages/my-shmem-file,share,id=mb1
> -                 -device ivshmem,x-memdev=mb1
> +qemu-system-x86_64 -object 
> memory-backend-file,size=1G,mem-path=/dev/hugepages/my-shmem-file,share,id=mb1
> +                 -device ivshmem-plain,memdev=mb1
>  @end example
>
>  ivshmem-server also supports hugepages mount points with the
> diff --git a/tests/ivshmem-test.c b/tests/ivshmem-test.c
> index 68d6840..891b6b8 100644
> --- a/tests/ivshmem-test.c
> +++ b/tests/ivshmem-test.c
> @@ -127,7 +127,9 @@ static void setup_vm_cmd(IVState *s, const char *cmd, 
> bool msix)
>
>  static void setup_vm(IVState *s)
>  {
> -    char *cmd = g_strdup_printf("-device ivshmem,shm=%s,size=1M", tmpshm);
> +    char *cmd = g_strdup_printf("-object memory-backend-file"
> +                                ",id=mb1,size=1M,share,mem-path=/dev/shm%s"
> +                                " -device ivshmem-plain,memdev=mb1", tmpshm);
>
>      setup_vm_cmd(s, cmd, false);
>
> @@ -284,8 +286,10 @@ static void *server_thread(void *data)
>  static void setup_vm_with_server(IVState *s, int nvectors, bool msi)
>  {
>      char *cmd = g_strdup_printf("-chardev socket,id=chr0,path=%s,nowait "
> -                                "-device 
> ivshmem,size=1M,chardev=chr0,vectors=%d,msi=%s",
> -                                tmpserver, nvectors, msi ? "true" : "false");
> +                                "-device ivshmem%s,chardev=chr0,vectors=%d",
> +                                tmpserver,
> +                                msi ? "-doorbell" : ",size=1M,msi=off",
> +                                nvectors);
>
>      setup_vm_cmd(s, cmd, msi);
>
> @@ -412,7 +416,7 @@ static void test_ivshmem_memdev(void)
>
>      /* just for the sake of checking memory-backend property */
>      setup_vm_cmd(&state, "-object memory-backend-ram,size=1M,id=mb1"
> -                 " -device ivshmem,x-memdev=mb1", false);
> +                 " -device ivshmem-plain,memdev=mb1", false);
>
>      cleanup_vm(&state);
>  }
> --
> 2.4.3
>
>



-- 
Marc-André Lureau

Reply via email to