Re: [Qemu-devel] [snabb-devel] Re: [PATCH v6 1/2] vhost-user: add multi queue support

Ouyang, Changchun Mon, 24 Aug 2015 20:30:07 -0700

Hi Michael,

> -----Original Message-----
> From: snabb-de...@googlegroups.com [mailto:snabb-
> de...@googlegroups.com] On Behalf Of Michael S. Tsirkin
> Sent: Thursday, August 13, 2015 5:19 PM
> To: Ouyang, Changchun
> Cc: qemu-devel@nongnu.org; snabb-de...@googlegroups.com;
> thibaut.col...@6wind.com; n.nikol...@virtualopensystems.com;
> l...@snabb.co; Long, Thomas
> Subject: [snabb-devel] Re: [PATCH v6 1/2] vhost-user: add multi queue
> support
> 
> On Wed, Aug 12, 2015 at 02:25:41PM +0800, Ouyang Changchun wrote:
> > Based on patch by Nikolay Nikolaev:
> > Vhost-user will implement the multi queue support in a similar way to
> > what vhost already has - a separate thread for each queue.
> > To enable the multi queue functionality - a new command line parameter
> > "queues" is introduced for the vhost-user netdev.
> >
> > The RESET_OWNER change is based on commit:
> >    294ce717e0f212ed0763307f3eab72b4a1bdf4d0
> > If it is reverted, the patch need update for it accordingly.
> >
> > Signed-off-by: Nikolay Nikolaev <n.nikol...@virtualopensystems.com>
> > Signed-off-by: Changchun Ouyang <changchun.ouy...@intel.com>
> > ---
> > Changes since v5:
> >  - fix the message descption for VHOST_RESET_OWNER in vhost-user txt
> >
> > Changes since v4:
> >  - remove the unnecessary trailing '\n'
> >
> > Changes since v3:
> >  - fix one typo and wrap one long line
> >
> > Changes since v2:
> >  - fix vq index issue for set_vring_call
> >    When it is the case of VHOST_SET_VRING_CALL, The vq_index is not
> initialized before it is used,
> >    thus it could be a random value. The random value leads to crash in vhost
> after passing down
> >    to vhost, as vhost use this random value to index an array index.
> >  - fix the typo in the doc and description
> >  - address vq index for reset_owner
> >
> > Changes since v1:
> >  - use s->nc.info_str when bringing up/down the backend
> >
> >  docs/specs/vhost-user.txt |  7 ++++++-
> >  hw/net/vhost_net.c        |  3 ++-
> >  hw/virtio/vhost-user.c    | 11 ++++++++++-
> >  net/vhost-user.c          | 37 ++++++++++++++++++++++++-------------
> >  qapi-schema.json          |  6 +++++-
> >  qemu-options.hx           |  5 +++--
> >  6 files changed, 50 insertions(+), 19 deletions(-)
> >
> > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> > index 70da3b1..9390f89 100644
> > --- a/docs/specs/vhost-user.txt
> > +++ b/docs/specs/vhost-user.txt
> > @@ -135,6 +135,11 @@ As older slaves don't support negotiating
> > protocol features,  a feature bit was dedicated for this purpose:
> >  #define VHOST_USER_F_PROTOCOL_FEATURES 30
> >
> > +Multi queue support
> > +-------------------
> > +The protocol supports multiple queues by setting all index fields in
> > +the sent messages to a properly calculated value.
> > +
> >  Message types
> >  -------------
> >
> > @@ -198,7 +203,7 @@ Message types
> >
> >        Id: 4
> >        Equivalent ioctl: VHOST_RESET_OWNER
> > -      Master payload: N/A
> > +      Master payload: vring state description
> >
> >        Issued when a new connection is about to be closed. The Master will 
> > no
> >        longer own this connection (and will usually close it).
> 
> This is an interface change, isn't it?
> We can't make it unconditionally, need to make it dependent on a protocol
> flag.
> 
> 
> > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index
> > 1f25cb3..9cd6c05 100644
> > --- a/hw/net/vhost_net.c
> > +++ b/hw/net/vhost_net.c
> > @@ -159,6 +159,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions
> > *options)
> >
> >      net->dev.nvqs = 2;
> >      net->dev.vqs = net->vqs;
> > +    net->dev.vq_index = net->nc->queue_index;
> >
> >      r = vhost_dev_init(&net->dev, options->opaque,
> >                         options->backend_type, options->force); @@
> > -269,7 +270,7 @@ static void vhost_net_stop_one(struct vhost_net *net,
> >          for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
> >              const VhostOps *vhost_ops = net->dev.vhost_ops;
> >              int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_OWNER,
> > -                                          NULL);
> > +                                          &file);
> >              assert(r >= 0);
> >          }
> >      }
> > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index
> > 27ba035..fb11d4c 100644
> > --- a/hw/virtio/vhost-user.c
> > +++ b/hw/virtio/vhost-user.c
> > @@ -219,7 +219,12 @@ static int vhost_user_call(struct vhost_dev *dev,
> unsigned long int request,
> >          break;
> >
> >      case VHOST_USER_SET_OWNER:
> > +        break;
> > +
> >      case VHOST_USER_RESET_OWNER:
> > +        memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
> > +        msg.state.index += dev->vq_index;
> > +        msg.size = sizeof(m.state);
> >          break;
> >
> >      case VHOST_USER_SET_MEM_TABLE:
> > @@ -262,17 +267,20 @@ static int vhost_user_call(struct vhost_dev *dev,
> unsigned long int request,
> >      case VHOST_USER_SET_VRING_NUM:
> >      case VHOST_USER_SET_VRING_BASE:
> >          memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
> > +        msg.state.index += dev->vq_index;
> >          msg.size = sizeof(m.state);
> >          break;
> >
> >      case VHOST_USER_GET_VRING_BASE:
> >          memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
> > +        msg.state.index += dev->vq_index;
> >          msg.size = sizeof(m.state);
> >          need_reply = 1;
> >          break;
> >
> >      case VHOST_USER_SET_VRING_ADDR:
> >          memcpy(&msg.addr, arg, sizeof(struct vhost_vring_addr));
> > +        msg.addr.index += dev->vq_index;
> >          msg.size = sizeof(m.addr);
> >          break;
> >
> > @@ -280,7 +288,7 @@ static int vhost_user_call(struct vhost_dev *dev,
> unsigned long int request,
> >      case VHOST_USER_SET_VRING_CALL:
> >      case VHOST_USER_SET_VRING_ERR:
> >          file = arg;
> > -        msg.u64 = file->index & VHOST_USER_VRING_IDX_MASK;
> > +        msg.u64 = (file->index + dev->vq_index) &
> > + VHOST_USER_VRING_IDX_MASK;
> >          msg.size = sizeof(m.u64);
> >          if (ioeventfd_enabled() && file->fd > 0) {
> >              fds[fd_num++] = file->fd; @@ -322,6 +330,7 @@ static int
> > vhost_user_call(struct vhost_dev *dev, unsigned long int request,
> >                  error_report("Received bad msg size.");
> >                  return -1;
> >              }
> > +            msg.state.index -= dev->vq_index;
> >              memcpy(arg, &msg.state, sizeof(struct vhost_vring_state));
> >              break;
> >          default:
> > diff --git a/net/vhost-user.c b/net/vhost-user.c index
> > 1d86a2b..904d8af 100644
> > --- a/net/vhost-user.c
> > +++ b/net/vhost-user.c
> > @@ -121,35 +121,39 @@ static void net_vhost_user_event(void *opaque,
> int event)
> >      case CHR_EVENT_OPENED:
> >          vhost_user_start(s);
> >          net_vhost_link_down(s, false);
> > -        error_report("chardev \"%s\" went up", s->chr->label);
> > +        error_report("chardev \"%s\" went up", s->nc.info_str);
> >          break;
> >      case CHR_EVENT_CLOSED:
> >          net_vhost_link_down(s, true);
> >          vhost_user_stop(s);
> > -        error_report("chardev \"%s\" went down", s->chr->label);
> > +        error_report("chardev \"%s\" went down", s->nc.info_str);
> >          break;
> >      }
> >  }
> >
> >  static int net_vhost_user_init(NetClientState *peer, const char *device,
> > -                               const char *name, CharDriverState *chr)
> > +                               const char *name, CharDriverState *chr,
> > +                               uint32_t queues)
> >  {
> >      NetClientState *nc;
> >      VhostUserState *s;
> > +    int i;
> >
> > -    nc = qemu_new_net_client(&net_vhost_user_info, peer, device,
> name);
> > +    for (i = 0; i < queues; i++) {
> > +        nc = qemu_new_net_client(&net_vhost_user_info, peer, device,
> > + name);
> >
> > -    snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user to %s",
> > -             chr->label);
> > +        snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user%d to %s",
> > +                 i, chr->label);
> >
> > -    s = DO_UPCAST(VhostUserState, nc, nc);
> > +        s = DO_UPCAST(VhostUserState, nc, nc);
> >
> > -    /* We don't provide a receive callback */
> > -    s->nc.receive_disabled = 1;
> > -    s->chr = chr;
> > -
> > -    qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event,
> s);
> > +        /* We don't provide a receive callback */
> > +        s->nc.receive_disabled = 1;
> > +        s->chr = chr;
> > +        s->nc.queue_index = i;
> >
> > +        qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event,
> s);
> > +    }
> >      return 0;
> >  }
> >
> > @@ -225,6 +229,7 @@ static int net_vhost_check_net(QemuOpts *opts,
> > void *opaque)
> 
> 
> There are two problems here:
> 
> 1. we don't really know that the backend
>    is able to support the requested number of queues.
>    If not, everything will fail, silently.
>    A new message to query the # of queues could help, though
>    I'm not sure what can be done on failure. Fail connection?
> 
> 2. each message (e.g. set memory table) is sent multiple times,
>    on the same socket.
> 
I think it is tough to resolve these 2 comments, as the current message is 
either vhost-dev based or virt-queue based,
The multiple queues(pair) feature use multiple vhost-devs to implement itself.
For #1
So the queue number is something should be seen in the upper level of vhost-dev 
rather than inside the vhost-dev.
For each vhost-net, there are 2 virt-queues, one is for Rx the other is for Tx.
introduce the virt-queue pair number into the vhost-dev? But I don't think it 
is good, as for each vhost-dev, there is only one
virt-queue pair.


Where should I put the virt-queue pair number to? I don't get the perfect 
answer till now. Any suggestion is welcome.

Could we assume the vhost backend has the ability to create enough virt-queue 
pair(e.g. 0x8000 is the max) if qemu require
Vhost backend to do it. If it is correct, we don't need get virt-queue pair 
number from vhost backend, as vhost backend can
Create all virt-queue pair required by qemu.
The virtio frontend(on guest) has the flexibility to enable which virt-queue 
according to its own capability, qemu can do it by using
Set_vring_flag message to notify vhost backend.   

For #2 
The memory table message is also vhost-dev based, it wouldn't hurt we send it a 
few times, vhost backend could
Keep it vhost-dev based too, or keep it once(keep it when first time and ignore 
in rest messages from the same connected-fd)
Any other good suggestion is welcome too :-)

> 
> 
> >  int net_init_vhost_user(const NetClientOptions *opts, const char *name,
> >                          NetClientState *peer)  {
> > +    uint32_t queues;
> >      const NetdevVhostUserOptions *vhost_user_opts;
> >      CharDriverState *chr;
> >
> > @@ -243,6 +248,12 @@ int net_init_vhost_user(const NetClientOptions
> *opts, const char *name,
> >          return -1;
> >      }
> >
> > +    /* number of queues for multiqueue */
> > +    if (vhost_user_opts->has_queues) {
> > +        queues = vhost_user_opts->queues;
> > +    } else {
> > +        queues = 1;
> > +    }
> >
> > -    return net_vhost_user_init(peer, "vhost_user", name, chr);
> > +    return net_vhost_user_init(peer, "vhost_user", name, chr,
> > + queues);
> >  }
> > diff --git a/qapi-schema.json b/qapi-schema.json index
> > f97ffa1..51e40ce 100644
> > --- a/qapi-schema.json
> > +++ b/qapi-schema.json
> > @@ -2444,12 +2444,16 @@
> >  #
> >  # @vhostforce: #optional vhost on for non-MSIX virtio guests (default:
> false).
> >  #
> > +# @queues: #optional number of queues to be created for multiqueue
> vhost-user
> > +#          (default: 1) (Since 2.5)
> > +#
> >  # Since 2.1
> >  ##
> >  { 'struct': 'NetdevVhostUserOptions',
> >    'data': {
> >      'chardev':        'str',
> > -    '*vhostforce':    'bool' } }
> > +    '*vhostforce':    'bool',
> > +    '*queues':        'uint32' } }
> >
> >  ##
> >  # @NetClientOptions
> > diff --git a/qemu-options.hx b/qemu-options.hx index ec356f6..dad035e
> > 100644
> > --- a/qemu-options.hx
> > +++ b/qemu-options.hx
> > @@ -1942,13 +1942,14 @@ The hubport netdev lets you connect a NIC to a
> > QEMU "vlan" instead of a single  netdev.  @code{-net} and
> > @code{-device} with parameter @option{vlan} create the  required hub
> automatically.
> >
> > -@item -netdev vhost-user,chardev=@var{id}[,vhostforce=on|off]
> > +@item -netdev
> > +vhost-user,chardev=@var{id}[,vhostforce=on|off][,queues=n]
> >
> >  Establish a vhost-user netdev, backed by a chardev @var{id}. The
> > chardev should  be a unix domain socket backed one. The vhost-user
> > uses a specifically defined  protocol to pass vhost ioctl replacement
> > messages to an application on the other  end of the socket. On
> > non-MSIX guests, the feature can be forced with -@var{vhostforce}.
> > +@var{vhostforce}. Use 'queues=@var{n}' to specify the number of
> > +queues to be created for multiqueue vhost-user.
> >
> >  Example:
> >  @example
> > --
> > 1.8.4.2
> >
> 
> --
> You received this message because you are subscribed to the Google Groups
> "Snabb Switch development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to snabb-devel+unsubscr...@googlegroups.com.
> To post to this group, send an email to snabb-de...@googlegroups.com.
> Visit this group at http://groups.google.com/group/snabb-devel.

Re: [Qemu-devel] [snabb-devel] Re: [PATCH v6 1/2] vhost-user: add multi queue support

Reply via email to