Re: [Qemu-devel] [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host device

2017-07-31 Thread Liu, Changpeng


> -Original Message-
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Monday, July 31, 2017 11:41 PM
> To: Stefan Hajnoczi ; Liu, Changpeng
> 
> Cc: qemu-devel@nongnu.org; fel...@nutanix.com; m...@redhat.com; Marc-
> André Lureau 
> Subject: Re: [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host
> device
> 
> On 31/07/2017 16:51, Stefan Hajnoczi wrote:
> > typedef enum VhostUserRequest {
> > ...
> >
> > /* Submitted by the vhost-user master when the guest writes to
> >  * virtio config space and also after live migration on the
> >  * destination host.
> >  */
> > VHOST_USER_SET_CONFIG,
> >
> > /* Submitted by the vhost-user master to fetch the contents of the
> >  * virtio config space.  The vhost-user master may cache the
> >  * contents to avoid repeated VHOST_USER_GET_CONFIG calls.
> >  */
> > VHOST_USER_GET_CONFIG,
> >
> > ...
> > };
> 
> Also:
> 
>   /* Submitted by the vhost-user master if it would like to
>* be informed of virtio config space changes.   The slave
>* signals the eventfd whenever config space is modified.
>*/
>   VHOST_USER_SET_CONFIG_FD,
> 
> Paolo
> 
Thanks Stefan and Paolo, looks much cleaner now.
> > struct VuDev {
> > ...
> > int config_change_fd;
> > ...
> > };




Re: [Qemu-devel] [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host device

2017-07-31 Thread Paolo Bonzini
On 31/07/2017 16:51, Stefan Hajnoczi wrote:
> typedef enum VhostUserRequest {
> ...
> 
> /* Submitted by the vhost-user master when the guest writes to
>  * virtio config space and also after live migration on the
>  * destination host.
>  */
> VHOST_USER_SET_CONFIG,
> 
> /* Submitted by the vhost-user master to fetch the contents of the
>  * virtio config space.  The vhost-user master may cache the
>  * contents to avoid repeated VHOST_USER_GET_CONFIG calls.
>  */
> VHOST_USER_GET_CONFIG,
> 
> ...
> };

Also:

/* Submitted by the vhost-user master if it would like to
 * be informed of virtio config space changes.   The slave
 * signals the eventfd whenever config space is modified.
 */
VHOST_USER_SET_CONFIG_FD,

Paolo

> struct VuDev {
> ...
> int config_change_fd;
> ...
> };




Re: [Qemu-devel] [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host device

2017-07-31 Thread Stefan Hajnoczi
On Sat, Jul 29, 2017 at 03:21:16AM +, Liu, Changpeng wrote:
> 
> 
> > -Original Message-
> > From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> > Sent: Friday, July 28, 2017 6:36 PM
> > To: Liu, Changpeng 
> > Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; fel...@nutanix.com;
> > m...@redhat.com
> > Subject: Re: [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk 
> > host device
> > 
> > On Thu, Jul 27, 2017 at 10:08:49AM +, Liu, Changpeng wrote:
> > > > -Original Message-
> > > > From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> > > > Sent: Thursday, July 27, 2017 5:49 PM
> > > > To: Liu, Changpeng 
> > > > Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; fel...@nutanix.com;
> > > > m...@redhat.com
> > > > Subject: Re: [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk 
> > > > host
> > > > device
> > > >
> > > > On Thu, Jul 27, 2017 at 12:29:45AM +, Liu, Changpeng wrote:
> > > > > > > +.fields = (VMStateField[]) {
> > > > > > > +VMSTATE_VIRTIO_DEVICE,
> > > > > > > +VMSTATE_END_OF_LIST()
> > > > > > > +},
> > > > > > > +};
> > > > > > > +
> > > > > > > +static Property vhost_user_blk_properties[] = {
> > > > > > > +DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> > > > > > > +DEFINE_BLOCK_PROPERTIES(VHostUserBlk, blkcfg),
> > > > > >
> > > > > > DEFINE_BLOCK_PROPERTIES cannot be used directly because it includes 
> > > > > > the
> > > > > > 'drive' (blkcfg.blk) parameter.  The command-line should not allow a
> > > > > > drive= parameter.
> > > > > >
> > > > > > Also, parameters like the discard granularity, optimal I/O size, 
> > > > > > logical
> > > > > > block size, etc can be initialized to sensible defaults by QEMU's 
> > > > > > block
> > > > > > layer when drive= is used.  Since you are bypassing QEMU's block 
> > > > > > layer
> > > > > > there is no way for QEMU to set good defaults.
> > > > > >
> > > > > > Are you relying on the managment tools populating these parameters 
> > > > > > with
> > > > > > the correct values from the vhost-user-blk process?  Or did you have
> > > > > > some other mechanism in mind?
> > > > > Actually for this part and your comments about block resize, I think 
> > > > > maybe
> > add
> > > > several
> > > > > additional vhost user messages such like:
> > > > VHOST_USER_GET_BLK_CFG/VHOST_USER_SET_BLK_CFG
> > > > > makes the code more clear, I'm okay to add such messages.
> > > >
> > > > New messages for virtio config space read/write might be useful for
> > > > other devices besides virtio-blk.
> > > I mean all the block device information like: block_size/capacity are 
> > > stored in
> > another process,
> > > so add a new vhost user message to Qemu vhost-user-blk can get those
> > information when Qemu get
> > > started, once those messages stored to virtio_pci config space, we will 
> > > not
> > change it.
> > 
> > No, I think updates are necessary:
> > 
> > The virtio block device can do online disk resize, so it will be
> > necessary to change the capacity field from the vhost-user-blk process
> > at runtime and raise a config change notification interrupt.
> > 
> > The virtio block device also has a config space field ("wce") that is
> > writable by the guest.  Supporting this feature also requires virtio
> > config space support in vhost-user.
> > 
> > If you focus on adding generic virtio config space messages to
> > vhost-user then these virtio block features can be supported by
> > vhost-user-blk.
> > 
> > Regarding the two approaches of adding "block device information" as you
> > have suggested versus adding generic virtio config space support as I'm
> > suggesting, from a protocol design perspective it's nicer to have
> > generic messages that are usable by all device types.  I'm not aware of
> > a reason why high-level "block device information" is needed since QEMU
> > will just put that into the config space but otherwise does not
> > interpret the information.
> Agreed, adding a vhost message  to get/set generic vitio_pci device config 
> space 
> is clear to me now, it's better than only for vhost-user-blk messages.
> 
> Here I still have an concern about *resize* feature for vhost-user-blk, 
> currently
> Qemu vhost-user-blk is the client for vhost user messages, does this means
> the I/O processing process must send a new vhost message back to Qemu
> vhost-user-blk driver that the capacity of the block device is changed? Or you
> have better idea to do this? Thanks.

A vhost-user process -> vhost-user master message is not necessary.  An
eventfd can be used to signal config changes instead.

I have CCed Marc-André because I don't know much about the vhost-user
protocol design.

Here is what I suggest for virtio config space:

typedef enum VhostUserRequest {
...

/* Submitted by the vhost-user master when the guest writes to
 * virtio config space and also after live migration on the
 * 

Re: [Qemu-devel] [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host device

2017-07-28 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Friday, July 28, 2017 6:36 PM
> To: Liu, Changpeng 
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; fel...@nutanix.com;
> m...@redhat.com
> Subject: Re: [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host 
> device
> 
> On Thu, Jul 27, 2017 at 10:08:49AM +, Liu, Changpeng wrote:
> > > -Original Message-
> > > From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> > > Sent: Thursday, July 27, 2017 5:49 PM
> > > To: Liu, Changpeng 
> > > Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; fel...@nutanix.com;
> > > m...@redhat.com
> > > Subject: Re: [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk 
> > > host
> > > device
> > >
> > > On Thu, Jul 27, 2017 at 12:29:45AM +, Liu, Changpeng wrote:
> > > > > > +.fields = (VMStateField[]) {
> > > > > > +VMSTATE_VIRTIO_DEVICE,
> > > > > > +VMSTATE_END_OF_LIST()
> > > > > > +},
> > > > > > +};
> > > > > > +
> > > > > > +static Property vhost_user_blk_properties[] = {
> > > > > > +DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> > > > > > +DEFINE_BLOCK_PROPERTIES(VHostUserBlk, blkcfg),
> > > > >
> > > > > DEFINE_BLOCK_PROPERTIES cannot be used directly because it includes 
> > > > > the
> > > > > 'drive' (blkcfg.blk) parameter.  The command-line should not allow a
> > > > > drive= parameter.
> > > > >
> > > > > Also, parameters like the discard granularity, optimal I/O size, 
> > > > > logical
> > > > > block size, etc can be initialized to sensible defaults by QEMU's 
> > > > > block
> > > > > layer when drive= is used.  Since you are bypassing QEMU's block layer
> > > > > there is no way for QEMU to set good defaults.
> > > > >
> > > > > Are you relying on the managment tools populating these parameters 
> > > > > with
> > > > > the correct values from the vhost-user-blk process?  Or did you have
> > > > > some other mechanism in mind?
> > > > Actually for this part and your comments about block resize, I think 
> > > > maybe
> add
> > > several
> > > > additional vhost user messages such like:
> > > VHOST_USER_GET_BLK_CFG/VHOST_USER_SET_BLK_CFG
> > > > makes the code more clear, I'm okay to add such messages.
> > >
> > > New messages for virtio config space read/write might be useful for
> > > other devices besides virtio-blk.
> > I mean all the block device information like: block_size/capacity are 
> > stored in
> another process,
> > so add a new vhost user message to Qemu vhost-user-blk can get those
> information when Qemu get
> > started, once those messages stored to virtio_pci config space, we will not
> change it.
> 
> No, I think updates are necessary:
> 
> The virtio block device can do online disk resize, so it will be
> necessary to change the capacity field from the vhost-user-blk process
> at runtime and raise a config change notification interrupt.
> 
> The virtio block device also has a config space field ("wce") that is
> writable by the guest.  Supporting this feature also requires virtio
> config space support in vhost-user.
> 
> If you focus on adding generic virtio config space messages to
> vhost-user then these virtio block features can be supported by
> vhost-user-blk.
> 
> Regarding the two approaches of adding "block device information" as you
> have suggested versus adding generic virtio config space support as I'm
> suggesting, from a protocol design perspective it's nicer to have
> generic messages that are usable by all device types.  I'm not aware of
> a reason why high-level "block device information" is needed since QEMU
> will just put that into the config space but otherwise does not
> interpret the information.
Agreed, adding a vhost message  to get/set generic vitio_pci device config 
space 
is clear to me now, it's better than only for vhost-user-blk messages.

Here I still have an concern about *resize* feature for vhost-user-blk, 
currently
Qemu vhost-user-blk is the client for vhost user messages, does this means
the I/O processing process must send a new vhost message back to Qemu
vhost-user-blk driver that the capacity of the block device is changed? Or you
have better idea to do this? Thanks.
> 
> > > It's worth checking how virtio config space is live migrated and how to
> > > do that consistently if the vhost-user process is involved.
> > Virtio will config spaces for virtio_blk, and even the config space 
> > migrated to
> another machine, it should be
> > same with the another I/O process, did I get your comments ? Thanks.
> 
> The guest-writable "wce" config space field is an example that shows the
> vhost-user-blk process on the destination host does not have all the
> state necessary.  QEMU needs to migrate the config space and send it to
> the vhost-user-blk process on the destination.
> 
> Stefan



Re: [Qemu-devel] [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host device

2017-07-28 Thread Stefan Hajnoczi
On Thu, Jul 27, 2017 at 10:08:49AM +, Liu, Changpeng wrote:
> > -Original Message-
> > From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> > Sent: Thursday, July 27, 2017 5:49 PM
> > To: Liu, Changpeng 
> > Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; fel...@nutanix.com;
> > m...@redhat.com
> > Subject: Re: [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host
> > device
> > 
> > On Thu, Jul 27, 2017 at 12:29:45AM +, Liu, Changpeng wrote:
> > > > > +.fields = (VMStateField[]) {
> > > > > +VMSTATE_VIRTIO_DEVICE,
> > > > > +VMSTATE_END_OF_LIST()
> > > > > +},
> > > > > +};
> > > > > +
> > > > > +static Property vhost_user_blk_properties[] = {
> > > > > +DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> > > > > +DEFINE_BLOCK_PROPERTIES(VHostUserBlk, blkcfg),
> > > >
> > > > DEFINE_BLOCK_PROPERTIES cannot be used directly because it includes the
> > > > 'drive' (blkcfg.blk) parameter.  The command-line should not allow a
> > > > drive= parameter.
> > > >
> > > > Also, parameters like the discard granularity, optimal I/O size, logical
> > > > block size, etc can be initialized to sensible defaults by QEMU's block
> > > > layer when drive= is used.  Since you are bypassing QEMU's block layer
> > > > there is no way for QEMU to set good defaults.
> > > >
> > > > Are you relying on the managment tools populating these parameters with
> > > > the correct values from the vhost-user-blk process?  Or did you have
> > > > some other mechanism in mind?
> > > Actually for this part and your comments about block resize, I think 
> > > maybe add
> > several
> > > additional vhost user messages such like:
> > VHOST_USER_GET_BLK_CFG/VHOST_USER_SET_BLK_CFG
> > > makes the code more clear, I'm okay to add such messages.
> > 
> > New messages for virtio config space read/write might be useful for
> > other devices besides virtio-blk.
> I mean all the block device information like: block_size/capacity are stored 
> in another process, 
> so add a new vhost user message to Qemu vhost-user-blk can get those 
> information when Qemu get
> started, once those messages stored to virtio_pci config space, we will not 
> change it.

No, I think updates are necessary:

The virtio block device can do online disk resize, so it will be
necessary to change the capacity field from the vhost-user-blk process
at runtime and raise a config change notification interrupt.

The virtio block device also has a config space field ("wce") that is
writable by the guest.  Supporting this feature also requires virtio
config space support in vhost-user.

If you focus on adding generic virtio config space messages to
vhost-user then these virtio block features can be supported by
vhost-user-blk.

Regarding the two approaches of adding "block device information" as you
have suggested versus adding generic virtio config space support as I'm
suggesting, from a protocol design perspective it's nicer to have
generic messages that are usable by all device types.  I'm not aware of
a reason why high-level "block device information" is needed since QEMU
will just put that into the config space but otherwise does not
interpret the information.

> > It's worth checking how virtio config space is live migrated and how to
> > do that consistently if the vhost-user process is involved.
> Virtio will config spaces for virtio_blk, and even the config space migrated 
> to another machine, it should be
> same with the another I/O process, did I get your comments ? Thanks. 

The guest-writable "wce" config space field is an example that shows the
vhost-user-blk process on the destination host does not have all the
state necessary.  QEMU needs to migrate the config space and send it to
the vhost-user-blk process on the destination.

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host device

2017-07-27 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Thursday, July 27, 2017 5:49 PM
> To: Liu, Changpeng 
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; fel...@nutanix.com;
> m...@redhat.com
> Subject: Re: [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host
> device
> 
> On Thu, Jul 27, 2017 at 12:29:45AM +, Liu, Changpeng wrote:
> > > > +.fields = (VMStateField[]) {
> > > > +VMSTATE_VIRTIO_DEVICE,
> > > > +VMSTATE_END_OF_LIST()
> > > > +},
> > > > +};
> > > > +
> > > > +static Property vhost_user_blk_properties[] = {
> > > > +DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> > > > +DEFINE_BLOCK_PROPERTIES(VHostUserBlk, blkcfg),
> > >
> > > DEFINE_BLOCK_PROPERTIES cannot be used directly because it includes the
> > > 'drive' (blkcfg.blk) parameter.  The command-line should not allow a
> > > drive= parameter.
> > >
> > > Also, parameters like the discard granularity, optimal I/O size, logical
> > > block size, etc can be initialized to sensible defaults by QEMU's block
> > > layer when drive= is used.  Since you are bypassing QEMU's block layer
> > > there is no way for QEMU to set good defaults.
> > >
> > > Are you relying on the managment tools populating these parameters with
> > > the correct values from the vhost-user-blk process?  Or did you have
> > > some other mechanism in mind?
> > Actually for this part and your comments about block resize, I think maybe 
> > add
> several
> > additional vhost user messages such like:
> VHOST_USER_GET_BLK_CFG/VHOST_USER_SET_BLK_CFG
> > makes the code more clear, I'm okay to add such messages.
> 
> New messages for virtio config space read/write might be useful for
> other devices besides virtio-blk.
I mean all the block device information like: block_size/capacity are stored in 
another process, 
so add a new vhost user message to Qemu vhost-user-blk can get those 
information when Qemu get
started, once those messages stored to virtio_pci config space, we will not 
change it.
> 
> It's worth checking how virtio config space is live migrated and how to
> do that consistently if the vhost-user process is involved.
Virtio will config spaces for virtio_blk, and even the config space migrated to 
another machine, it should be
same with the another I/O process, did I get your comments ? Thanks. 
> 
> Stefan



Re: [Qemu-devel] [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host device

2017-07-27 Thread Stefan Hajnoczi
On Thu, Jul 27, 2017 at 12:29:45AM +, Liu, Changpeng wrote:
> > > +.fields = (VMStateField[]) {
> > > +VMSTATE_VIRTIO_DEVICE,
> > > +VMSTATE_END_OF_LIST()
> > > +},
> > > +};
> > > +
> > > +static Property vhost_user_blk_properties[] = {
> > > +DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> > > +DEFINE_BLOCK_PROPERTIES(VHostUserBlk, blkcfg),
> > 
> > DEFINE_BLOCK_PROPERTIES cannot be used directly because it includes the
> > 'drive' (blkcfg.blk) parameter.  The command-line should not allow a
> > drive= parameter.
> > 
> > Also, parameters like the discard granularity, optimal I/O size, logical
> > block size, etc can be initialized to sensible defaults by QEMU's block
> > layer when drive= is used.  Since you are bypassing QEMU's block layer
> > there is no way for QEMU to set good defaults.
> > 
> > Are you relying on the managment tools populating these parameters with
> > the correct values from the vhost-user-blk process?  Or did you have
> > some other mechanism in mind?
> Actually for this part and your comments about block resize, I think maybe 
> add several
> additional vhost user messages such like: 
> VHOST_USER_GET_BLK_CFG/VHOST_USER_SET_BLK_CFG
> makes the code more clear, I'm okay to add such messages.

New messages for virtio config space read/write might be useful for
other devices besides virtio-blk.

It's worth checking how virtio config space is live migrated and how to
do that consistently if the vhost-user process is involved.

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host device

2017-07-26 Thread Liu, Changpeng


> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Wednesday, July 26, 2017 6:35 PM
> To: Liu, Changpeng 
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; fel...@nutanix.com;
> m...@redhat.com
> Subject: Re: [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host
> device
> 
> On Thu, Jul 27, 2017 at 10:00:50AM +0800, Changpeng Liu wrote:
> > +static void vhost_user_blk_set_config(VirtIODevice *vdev, const uint8_t
> *config)
> > +{
> > +VHostUserBlk *s = VHOST_USER_BLK(vdev);
> > +struct virtio_blk_config blkcfg;
> > +
> > +memcpy(, config, sizeof(blkcfg));
> > +
> > +if (blkcfg.wce != s->config_wce) {
> > +error_report("vhost-user-blk: does not support the operation");
> 
> If vhost-user-blk doesn't support the operation then please remove the
> VIRTIO_BLK_F_CONFIG_WCE feature bit.  That way the guest knows it cannot
> modify this field.
> 
> > +static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
> > +{
> > +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > +VHostUserBlk *s = VHOST_USER_BLK(vdev);
> > +int ret;
> > +uint64_t size;
> > +
> > +if (!s->chardev.chr) {
> > +error_setg(errp, "vhost-user-blk: chardev is mandatary");
> 
> mandatory
> 
> > +return;
> > +}
> > +
> > +if (!s->num_queues) {
> > +error_setg(errp, "vhost-user-blk: invalid number of IO queues");
> > +return;
> > +}
> > +
> > +if (!s->queue_size) {
> > +error_setg(errp, "vhost-user-blk: invalid count of the IO queue");
> 
> "count of the IO queue" sounds like number of queues.  I suggest saying
> "queue size must be non-zero".
> 
> > +return;
> > +}
> > +
> > +if (!s->size) {
> > +error_setg(errp, "vhost-user-blk: block capacity must be assigned,"
> > +  "size can be specified by GiB or MiB");
> > +return;
> > +}
> > +
> > +ret = qemu_strtosz_MiB(s->size, NULL, );
> > +if (ret < 0) {
> > +error_setg(errp, "vhost-user-blk: invalid size %s in GiB/MiB", 
> > s->size);
> > +return;
> > +}
> > +s->capacity = size / 512;
> > +
> > +/* block size with default 512 Bytes */
> > +if (!s->blkcfg.logical_block_size) {
> > +s->blkcfg.logical_block_size = 512;
> > +}
> > +
> > +virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
> > +sizeof(struct virtio_blk_config));
> > +virtio_add_queue(vdev, s->queue_size, vhost_user_blk_handle_output);
> > +
> > +s->dev.nvqs = s->num_queues;
> > +s->dev.vqs = g_new(struct vhost_virtqueue, s->dev.nvqs);
> 
> Please test multi-queue, it's currently broken.  virtio_add_queue() must
> be called for each queue.
Yes, should move virtio_add_queue into loop.
> 
> > +s->dev.vq_index = 0;
> > +s->dev.backend_features = 0;
> > +
> > +ret = vhost_dev_init(>dev, (void *)>chardev,
> 
> The compiler automatically converts any pointer type to void * without a
> warning in C.  (This is different from C++!)
> 
> The (void *) cast can be dropped.
> 
> > + VHOST_BACKEND_TYPE_USER, 0);
> > +if (ret < 0) {
> > +error_setg(errp, "vhost-user-blk: vhost initialization failed: %s",
> > +   strerror(-ret));
> 
> If realize fails unrealize() will not be called.  Cleanup must be
> performed here.
> 
> > +return;
> > +}
> > +}
> > +
> > +static void vhost_user_blk_device_unrealize(DeviceState *dev, Error **errp)
> > +{
> > +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > +VHostUserBlk *s = VHOST_USER_BLK(dev);
> > +
> > +vhost_user_blk_set_status(vdev, 0);
> > +vhost_dev_cleanup(>dev);
> > +g_free(s->dev.vqs);
> > +virtio_cleanup(vdev);
> > +}
> > +
> > +static void vhost_user_blk_instance_init(Object *obj)
> > +{
> > +VHostUserBlk *s = VHOST_USER_BLK(obj);
> > +
> > +device_add_bootindex_property(obj, >bootindex, "bootindex",
> > +  "/disk@0,0", DEVICE(obj), NULL);
> > +}
> > +
> > +static const VMStateDescription vmstate_vhost_user_blk = {
> > +.name = "vhost-user-blk",
> > +.minimum_version_id = 2,
> > +.version_id = 2,
> 
> Why is version_id 2?  There has never been a vhost-user-blk device in
> qemu.git before, so I would expect version to be 1.
> 
> > +.fields = (VMStateField[]) {
> > +VMSTATE_VIRTIO_DEVICE,
> > +VMSTATE_END_OF_LIST()
> > +},
> > +};
> > +
> > +static Property vhost_user_blk_properties[] = {
> > +DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> > +DEFINE_BLOCK_PROPERTIES(VHostUserBlk, blkcfg),
> 
> DEFINE_BLOCK_PROPERTIES cannot be used directly because it includes the
> 'drive' (blkcfg.blk) parameter.  The command-line should not allow a
> drive= parameter.
> 
> Also, parameters like the discard granularity, optimal I/O size, logical
> block size, etc can be initialized to sensible defaults by QEMU's block
> layer when 

Re: [Qemu-devel] [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host device

2017-07-26 Thread Stefan Hajnoczi
On Thu, Jul 27, 2017 at 10:00:50AM +0800, Changpeng Liu wrote:
> +static void vhost_user_blk_set_config(VirtIODevice *vdev, const uint8_t 
> *config)
> +{
> +VHostUserBlk *s = VHOST_USER_BLK(vdev);
> +struct virtio_blk_config blkcfg;
> +
> +memcpy(, config, sizeof(blkcfg));
> +
> +if (blkcfg.wce != s->config_wce) {
> +error_report("vhost-user-blk: does not support the operation");

If vhost-user-blk doesn't support the operation then please remove the
VIRTIO_BLK_F_CONFIG_WCE feature bit.  That way the guest knows it cannot
modify this field.

> +static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp)
> +{
> +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +VHostUserBlk *s = VHOST_USER_BLK(vdev);
> +int ret;
> +uint64_t size;
> +
> +if (!s->chardev.chr) {
> +error_setg(errp, "vhost-user-blk: chardev is mandatary");

mandatory

> +return;
> +}
> +
> +if (!s->num_queues) {
> +error_setg(errp, "vhost-user-blk: invalid number of IO queues");
> +return;
> +}
> +
> +if (!s->queue_size) {
> +error_setg(errp, "vhost-user-blk: invalid count of the IO queue");

"count of the IO queue" sounds like number of queues.  I suggest saying
"queue size must be non-zero".

> +return;
> +}
> +
> +if (!s->size) {
> +error_setg(errp, "vhost-user-blk: block capacity must be assigned,"
> +  "size can be specified by GiB or MiB");
> +return;
> +}
> +
> +ret = qemu_strtosz_MiB(s->size, NULL, );
> +if (ret < 0) {
> +error_setg(errp, "vhost-user-blk: invalid size %s in GiB/MiB", 
> s->size);
> +return;
> +}
> +s->capacity = size / 512;
> +
> +/* block size with default 512 Bytes */
> +if (!s->blkcfg.logical_block_size) {
> +s->blkcfg.logical_block_size = 512;
> +}
> +
> +virtio_init(vdev, "virtio-blk", VIRTIO_ID_BLOCK,
> +sizeof(struct virtio_blk_config));
> +virtio_add_queue(vdev, s->queue_size, vhost_user_blk_handle_output);
> +
> +s->dev.nvqs = s->num_queues;
> +s->dev.vqs = g_new(struct vhost_virtqueue, s->dev.nvqs);

Please test multi-queue, it's currently broken.  virtio_add_queue() must
be called for each queue.

> +s->dev.vq_index = 0;
> +s->dev.backend_features = 0;
> +
> +ret = vhost_dev_init(>dev, (void *)>chardev,

The compiler automatically converts any pointer type to void * without a
warning in C.  (This is different from C++!)

The (void *) cast can be dropped.

> + VHOST_BACKEND_TYPE_USER, 0);
> +if (ret < 0) {
> +error_setg(errp, "vhost-user-blk: vhost initialization failed: %s",
> +   strerror(-ret));

If realize fails unrealize() will not be called.  Cleanup must be
performed here.

> +return;
> +}
> +}
> +
> +static void vhost_user_blk_device_unrealize(DeviceState *dev, Error **errp)
> +{
> +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +VHostUserBlk *s = VHOST_USER_BLK(dev);
> +
> +vhost_user_blk_set_status(vdev, 0);
> +vhost_dev_cleanup(>dev);
> +g_free(s->dev.vqs);
> +virtio_cleanup(vdev);
> +}
> +
> +static void vhost_user_blk_instance_init(Object *obj)
> +{
> +VHostUserBlk *s = VHOST_USER_BLK(obj);
> +
> +device_add_bootindex_property(obj, >bootindex, "bootindex",
> +  "/disk@0,0", DEVICE(obj), NULL);
> +}
> +
> +static const VMStateDescription vmstate_vhost_user_blk = {
> +.name = "vhost-user-blk",
> +.minimum_version_id = 2,
> +.version_id = 2,

Why is version_id 2?  There has never been a vhost-user-blk device in
qemu.git before, so I would expect version to be 1.

> +.fields = (VMStateField[]) {
> +VMSTATE_VIRTIO_DEVICE,
> +VMSTATE_END_OF_LIST()
> +},
> +};
> +
> +static Property vhost_user_blk_properties[] = {
> +DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev),
> +DEFINE_BLOCK_PROPERTIES(VHostUserBlk, blkcfg),

DEFINE_BLOCK_PROPERTIES cannot be used directly because it includes the
'drive' (blkcfg.blk) parameter.  The command-line should not allow a
drive= parameter.

Also, parameters like the discard granularity, optimal I/O size, logical
block size, etc can be initialized to sensible defaults by QEMU's block
layer when drive= is used.  Since you are bypassing QEMU's block layer
there is no way for QEMU to set good defaults.

Are you relying on the managment tools populating these parameters with
the correct values from the vhost-user-blk process?  Or did you have
some other mechanism in mind?

> +DEFINE_BLOCK_CHS_PROPERTIES(VHostUserBlk, blkcfg),
> +DEFINE_PROP_STRING("size", VHostUserBlk, size),

virtio-blk supports online resize.  QEMU and the vhost-user-blk process
need a way to exchange disk capacity information for online resize.

Please add the necessary vhost-user protocol messages now so size
information can be automatically exchanged and updated for resize.

> 

[Qemu-devel] [PATCH 1/2] vhost-user-blk: introduce a new vhost-user-blk host device

2017-07-25 Thread Changpeng Liu
This commit introduces a vhost-user device for block, it uses a
chardev to connect to the backend, Same with virito_blk, Guest
OS still uses the virtio_blk frontend driver.

To use it, start Qemu with command line like this:

qemu-system-x86_64 \
-chardev socket,id=char0,path=/path/vhost.socket \
-device vhost-user-blk-pci,chardev=char0,size=10G, \
logical_block_size=...

Different with exist Qemu virtio_blk host device, it makes more easy
for users to implement their own I/O processing logic, such as all
user space I/O stack against hardware block device. However, due to
exist vhost-user messages, Qemu can't get the parameters such as
capacity of the backend block device, users need to append such
parameters when start Qemu.

Signed-off-by: Changpeng Liu 
---
 configure  |  11 ++
 hw/block/Makefile.objs |   3 +
 hw/block/vhost-user-blk.c  | 352 +
 hw/virtio/virtio-pci.c |  55 ++
 hw/virtio/virtio-pci.h |  18 ++
 include/hw/virtio/vhost-user-blk.h |  46 +
 6 files changed, 485 insertions(+)
 create mode 100644 hw/block/vhost-user-blk.c
 create mode 100644 include/hw/virtio/vhost-user-blk.h

diff --git a/configure b/configure
index 987f59b..466b3b4 100755
--- a/configure
+++ b/configure
@@ -305,6 +305,7 @@ tcg="yes"
 
 vhost_net="no"
 vhost_scsi="no"
+vhost_user_blk="no"
 vhost_vsock="no"
 kvm="no"
 hax="no"
@@ -778,6 +779,7 @@ Linux)
   kvm="yes"
   vhost_net="yes"
   vhost_scsi="yes"
+  vhost_user_blk="yes"
   vhost_vsock="yes"
   QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers 
$QEMU_INCLUDES"
   supported_os="yes"
@@ -1135,6 +1137,10 @@ for opt do
   ;;
   --enable-vhost-scsi) vhost_scsi="yes"
   ;;
+  --disable-vhost-user-blk) vhost_user_blk="no"
+  ;;
+  --enable-vhost-user-blk) vhost_user_blk="yes"
+  ;;
   --disable-vhost-vsock) vhost_vsock="no"
   ;;
   --enable-vhost-vsock) vhost_vsock="yes"
@@ -1489,6 +1495,7 @@ disabled with --disable-FEATURE, default is enabled if 
available:
   cap-ng  libcap-ng support
   attrattr and xattr support
   vhost-net   vhost-net acceleration support
+  vhost-user-blk  VM virtio-blk acceleration in user space
   spice   spice
   rbd rados block device (rbd)
   libiscsiiscsi support
@@ -5347,6 +5354,7 @@ echo "posix_madvise $posix_madvise"
 echo "libcap-ng support $cap_ng"
 echo "vhost-net support $vhost_net"
 echo "vhost-scsi support $vhost_scsi"
+echo "vhost-user-blk support $vhost_user_blk"
 echo "vhost-vsock support $vhost_vsock"
 echo "Trace backends$trace_backends"
 if have_backend "simple"; then
@@ -5757,6 +5765,9 @@ fi
 if test "$vhost_scsi" = "yes" ; then
   echo "CONFIG_VHOST_SCSI=y" >> $config_host_mak
 fi
+if test "$vhost_user_blk" = "yes" ; then
+  echo "CONFIG_VHOST_USER_BLK=y" >> $config_host_mak
+fi
 if test "$vhost_net" = "yes" ; then
   echo "CONFIG_VHOST_NET_USED=y" >> $config_host_mak
 fi
diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
index e0ed980..4c19a58 100644
--- a/hw/block/Makefile.objs
+++ b/hw/block/Makefile.objs
@@ -13,3 +13,6 @@ obj-$(CONFIG_SH4) += tc58128.o
 
 obj-$(CONFIG_VIRTIO) += virtio-blk.o
 obj-$(CONFIG_VIRTIO) += dataplane/
+ifeq ($(CONFIG_VIRTIO),y)
+obj-$(CONFIG_VHOST_USER_BLK) += vhost-user-blk.o
+endif
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
new file mode 100644
index 000..38d7855
--- /dev/null
+++ b/hw/block/vhost-user-blk.c
@@ -0,0 +1,352 @@
+/*
+ * vhost-user-blk host device
+ *
+ * Copyright IBM, Corp. 2011
+ * Copyright(C) 2017 Intel Corporation.
+ *
+ * Authors:
+ *  Stefan Hajnoczi 
+ *  Changpeng Liu 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "migration/vmstate.h"
+#include "migration/migration.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/typedefs.h"
+#include "qemu/cutils.h"
+#include "qom/object.h"
+#include "hw/qdev-core.h"
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-user-blk.h"
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
+
+static const int user_feature_bits[] = {
+VIRTIO_BLK_F_SIZE_MAX,
+VIRTIO_BLK_F_SEG_MAX,
+VIRTIO_BLK_F_GEOMETRY,
+VIRTIO_BLK_F_BLK_SIZE,
+VIRTIO_BLK_F_TOPOLOGY,
+VIRTIO_BLK_F_SCSI,
+VIRTIO_BLK_F_MQ,
+VIRTIO_BLK_F_RO,
+VIRTIO_BLK_F_FLUSH,
+VIRTIO_BLK_F_BARRIER,
+VIRTIO_BLK_F_CONFIG_WCE,
+VIRTIO_F_VERSION_1,
+VIRTIO_RING_F_INDIRECT_DESC,
+VIRTIO_RING_F_EVENT_IDX,
+VIRTIO_F_NOTIFY_ON_EMPTY,
+VHOST_INVALID_FEATURE_BIT
+};
+
+static void vhost_user_blk_update_config(VirtIODevice *vdev, uint8_t *config)
+{
+VHostUserBlk *s = VHOST_USER_BLK(vdev);
+int