Hi Nikos,

On 6/19/19 5:14 PM, Nikos Dragazis wrote:
Hi everyone,

this patch series introduces the concept of the virtio-vhost-user
transport. This is actually a revised version of an earlier RFC
implementation that has been proposed by Stefan Hajnoczi [1]. Though
this is a great feature, it seems to have been stalled, so I’d like to
restart the conversation on this and hopefully get it merged with your
help. Let me give you an overview.

Thanks for taking over the series!

I think you are already aware of this, but it arrives too late to be
considered for v19.08, as the proposal deadline passed almost 3 weeks
ago.

That said, it is good that you sent it early, so that we can work to
make it in for v19.11.

The virtio-vhost-user transport is a vhost-user transport implementation
that is based on the virtio-vhost-user device. Its key difference from
the existing transport is that it allows deploying vhost-user targets
inside dedicated Storage Appliance VMs instead of host user space. In
other words, it allows having guests that act as vhost-user backends for
other guests.

The virtio-vhost-user device implements the vhost-user control plane
(master-slave communication) as follows:

1. it parses the vhost-user messages from the vhost-user unix domain
    socket and forwards them to the slave guest through virtqueues

2. it maps the vhost memory regions in QEMU’s process address space and
    exposes them to the slave guest as a RAM-backed PCI MMIO region

3. it hooks up doorbells to the callfds. The slave guest can use these
    doorbells to interrupt the master guest driver

The device code has not yet been merged into upstream QEMU, but this is
definitely the end goal.

Could you provide a pointer to the QEMU series, and instructions to test
this new device?

The current state is that we are waiting for
the approval of the virtio spec.

Ditto, a link to the spec patches would be useful.

I have Cced Darek from the SPDK community who has helped me a lot by
reviewing this series. Note that any device type could be implemented
over this new transport. So, adding the virtio-vhost-user transport in
DPDK would allow using it from SPDK as well.

Getting into the code internals, this patch series makes the following
changes:

1. introduce a generic interface for the transport-specific operations.
    Each of the two available transports, the pre-existing AF_UNIX
    transport and the virtio-vhost-user transport, is going to implement
    this interface. The AF_UNIX-specific code has been extracted from the
    core vhost-user code and is now part of the AF_UNIX transport
    implementation in trans_af_unix.c.

2. introduce the virtio-vhost-user transport. The virtio-vhost-user
    transport requires a driver for the virtio-vhost-user devices. The
    driver along with the transport implementation have been packed into
    a separate library in `drivers/virtio_vhost_user/`. The necessary
    virtio-pci code has been copied from `drivers/net/virtio/`. Some
    additional changes have been made so that the driver can utilize the
    additional resources of the virtio-vhost-user device.

3. update librte_vhost public API to enable choosing transport for each
    new vhost device. Extend the vhost net driver and vhost-scsi example
    application to export this new API to the end user.
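
As a rough illustration of items 1 and 3 above, the sketch below shows
one way a table of transport-specific operations could be selected per
device. It is not taken from the patches: every name in it
(`transport_ops`, `transport_get`, `device_init`, `device_send`, the
`TRANSPORT_*` constants) is hypothetical, and the two `send` functions
are stubs that only record which transport was used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ops table: one instance per transport. */
struct transport_ops {
    const char *name;
    /* Send a vhost-user message; returns the number of bytes sent. */
    int (*send)(const void *msg, size_t len);
};

/* Records the name of the transport that handled the last send. */
static const char *last_sender;

/* Stub standing in for the AF_UNIX (unix domain socket) path. */
static int af_unix_send(const void *msg, size_t len)
{
    (void)msg;
    last_sender = "af_unix";
    return (int)len;
}

/* Stub standing in for the virtio-vhost-user (virtqueue) path. */
static int vvu_send(const void *msg, size_t len)
{
    (void)msg;
    last_sender = "virtio-vhost-user";
    return (int)len;
}

static const struct transport_ops af_unix_ops = { "af_unix", af_unix_send };
static const struct transport_ops vvu_ops = { "virtio-vhost-user", vvu_send };

/* Hypothetical selector exposed to the caller when creating a device. */
enum transport_type { TRANSPORT_AF_UNIX, TRANSPORT_VVU };

static const struct transport_ops *transport_get(enum transport_type t)
{
    switch (t) {
    case TRANSPORT_AF_UNIX:
        return &af_unix_ops;
    case TRANSPORT_VVU:
        return &vvu_ops;
    default:
        return NULL; /* unknown transport */
    }
}

/* Core code holds only the ops pointer, never transport details. */
struct device {
    const struct transport_ops *ops;
};

int device_init(struct device *dev, enum transport_type t)
{
    assert(dev != NULL);
    dev->ops = transport_get(t);
    return dev->ops != NULL ? 0 : -1;
}

int device_send(struct device *dev, const void *msg, size_t len)
{
    return dev->ops->send(msg, len);
}
```

The point of the indirection is that the core code calls
`dev->ops->send()` without knowing which transport sits behind it, so
AF_UNIX-only logic can stay out of the shared path.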

The primary changes I did to Stefan’s RFC implementation are the
following:

1. moved postcopy live migration code into trans_af_unix.c. Postcopy
    live migration relies on the userfault fd mechanism, which cannot be
    supported by virtio-vhost-user.

2. moved setup of the log memory region into trans_af_unix.c. Setting up
    the log memory region involves mapping/unmapping guest memory. This
    is an AF_UNIX transport-specific operation.

3. introduced a vhost transport operation for
    process_slave_message_reply()

4. moved the virtio-vhost-user transport/driver into a separate library
    in `drivers/virtio_vhost_user/`. This required making vhost.h and
    vhost_user.h part of librte_vhost public API and exporting some
    private symbols via the version script. This looks better to me than
    just moving the entire librte_vhost into `drivers/`. I am not sure if
    this is the most appropriate solution. I am looking forward to your
    suggestions on this.

I'm not sure this is the right place to put it.

5. made use of the virtio PCI capabilities for the additional device
    resources (doorbells, shared memory). This required changes in
    virtio_pci.c and trans_virtio_vhost_user.c.

6. [minor] changed some commit headlines to comply with
    check-git-log.sh.

Please have a look and let me know your thoughts. Any
reviews/pointers/suggestions are welcome.

Maxime
