Re: [Qemu-devel] [dpdk-dev] [ovs-dev] [PATCH RFC] netdev-dpdk: Fix device obtain mac address when received first packet in vhost type
On Fri, Nov 24, 2017 at 05:59:09PM +0800, Chen Hailin wrote:
> Hi Aaron Conole && Jianfeng,
>
> The stp could not work in ovs-dpdk vhostuser, because the attached
> vhost device doesn't have a MAC address.
>
> Now we have two ways to solve this problem.
> 1. The vhost learns the MAC address from a packet, as in my first patch.

I do agree with Aaron this is not the right way.

> 2. The virtio driver notifies the MAC address actively to vhost-user.

Unfortunately, AFAIK, there is no way to achieve that so far. We could
either let virtio/QEMU expose the CQ to vhost, or add a new VHOST_USER
message to carry the mac address. While vhost-user is a generic
interface, adding a virtio-net specific message also doesn't seem quite
right. Exposing the CQ is probably the best we can do. Anyway, both
need a spec change.

    --yliu

> In my opinion, if we treat it as a device, we should allocate a
> MAC address for the device when the VM is started.
>
> Which one do you think is better?
>
> Best Regards,
> Chen Hailin
> che...@arraynetworks.com.cn
>
> From: Aaron Conole
> Date: 2017-11-18 10:00
> To: Hailin Chen
> CC: ovs-...@openvswitch.org; Maxime Coquelin; cl...@arraynetworks.com.cn
> Subject: Re: [ovs-dev] [PATCH RFC] netdev-dpdk: Fix device obtain mac address when received first packet in vhost type
>
> Hi Hailin,
>
> Hailin Chen writes:
>
> > The stp could not work on netdev-dpdk if the network has a loop,
> > because the stp protocol negotiates the designated port by sending
> > BPDU packets which contain a MAC address. However, the device doesn't
> > have a MAC address in vhostuser type; thus, function send_bpdu_cb
> > would not send BPDU packets.
> >
> > This patch sets the MAC for the device when the first packet is received.
> >
> > Signed-off-by: Hailin Chen
> > ---
>
> Thanks for the patch.
>
> In general, I don't think this is the right approach to deal with this
> type of issue. I believe the problem statement is that the OvS bridge is
> unaware of the guest MAC address - did I get it right? In that case, I
> would think that a better way to solve this would be to have virtio tell
> the mac address of the guest. I don't recall right now if that's
> allowed in the virtio spec, but I do remember some kind of feature
> negotiation.
>
> I've CC'd Maxime, who is one of the maintainers of the virtio code on
> the DPDK side. Perhaps there is an alternate way to solve this.
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
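Option 2 above (virtio actively notifying the backend of its MAC) would, as noted, need a spec change. Purely as an illustration of what such a change might look like, here is a sketch of a new, entirely hypothetical vhost-user message payload and backend handler; the message name, request id, and handler are invented for this example and are not part of any spec:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request id for a vhost-user message that would carry
 * the guest MAC to the backend. No such message exists in the
 * vhost-user spec today; the name and value are placeholders. */
#define VHOST_USER_NET_SET_MAC 1000u

struct vhost_user_mac_payload {
    uint8_t mac[6];
};

/* Backend side: record the announced MAC in the device state, so that
 * e.g. OvS' send_bpdu_cb could build BPDUs with a valid source MAC. */
static void handle_net_set_mac(uint8_t dev_mac[6],
                               const struct vhost_user_mac_payload *p)
{
    memcpy(dev_mac, p->mac, sizeof(p->mac));
}
```

The open question from the thread remains: whether such a virtio-net specific message belongs in a device-agnostic protocol at all.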
Re: [Qemu-devel] [RFC v3 1/3] vhost-user: Add MTU protocol feature and op
On Wed, Nov 30, 2016 at 11:10:15AM +0100, Maxime Coquelin wrote:
> +#define VHOST_USER_PROTOCOL_F_MTU 4
>
> Message types
> -
> @@ -470,6 +471,18 @@ Message types
>    The first 6 bytes of the payload contain the mac address of the guest to
>    allow the vhost user backend to construct and broadcast the fake RARP.
>
> + * VHOST_USER_SET_MTU

One thing I have just thought of is that, though we currently have one
vhost-user driver (net), SPDK here (James cc'ed) is proposing vhost-user
for other devices, say SCSI. So maybe it's better to add some type
prefix here, like VHOST_USER_NET_SET_MTU?

    --yliu
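For context, the new message is gated by a protocol feature bit (bit 4 in the hunk above): the master may only send it once the slave has acked that bit during protocol-feature negotiation. A minimal sketch of that check (the function name is mine, not QEMU's or DPDK's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit number taken from the patch hunk under discussion. */
#define VHOST_USER_PROTOCOL_F_MTU 4

/* The master may only send VHOST_USER_SET_MTU after the slave acked
 * this bit in the VHOST_USER_[GET|SET]_PROTOCOL_FEATURES exchange. */
static bool mtu_feature_acked(uint64_t acked_protocol_features)
{
    return (acked_protocol_features >> VHOST_USER_PROTOCOL_F_MTU) & 1;
}
```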
Re: [Qemu-devel] [dpdk-dev] dpdk/vpp and cross-version migration for vhost
On Thu, Nov 24, 2016 at 09:30:49AM +, Kevin Traynor wrote: > On 11/24/2016 06:31 AM, Yuanhan Liu wrote: > > On Tue, Nov 22, 2016 at 04:53:05PM +0200, Michael S. Tsirkin wrote: > >>>> You keep assuming that you have the VM started first and > >>>> figure out things afterwards, but this does not work. > >>>> > >>>> Think about a cluster of machines. You want to start a VM in > >>>> a way that will ensure compatibility with all hosts > >>>> in a cluster. > >>> > >>> I see. I was more considering about the case when the dst > >>> host (including the qemu and dpdk combo) is given, and > >>> then determine whether it will be a successfull migration > >>> or not. > >>> > >>> And you are asking that we need to know which host could > >>> be a good candidate before starting the migration. In such > >>> case, we indeed need some inputs from both the qemu and > >>> vhost-user backend. > >>> > >>> For DPDK, I think it could be simple, just as you said, it > >>> could be either a tiny script, or even a macro defined in > >>> the source code file (we extend it every time we add a > >>> new feature) to let the libvirt to read it. Or something > >>> else. > >> > >> There's the issue of APIs that tweak features as Maxime > >> suggested. > > > > Yes, it's a good point. > > > >> Maybe the only thing to do is to deprecate it, > > > > Looks like so. > > > >> but I feel some way for application to pass info into > >> guest might be benefitial. > > > > The two APIs are just for tweaking feature bits DPDK supports before > > any device got connected. It's another way to disable some features > > (the another obvious way is to through QEMU command lines). > > > > IMO, it's bit handy only in a case like: we have bunch of VMs. Instead > > of disabling something though qemu one by one, we could disable it > > once in DPDK. > > > > But I doubt the useful of it. It's only used in DPDK's vhost example > > after all. Nor is it used in vhost pmd, neither is it used in OVS. 
> > rte_vhost_feature_disable() is currently used in OVS, lib/netdev-dpdk.c

Hmmm. I must have checked very old code ...

> netdev_dpdk_vhost_class_init(void)
> {
>     static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
>
>     /* This function can be called for different classes.  The
>      * initialization needs to be done only once */
>     if (ovsthread_once_start(&once)) {
>         rte_vhost_driver_callback_register(&virtio_net_device_ops);
>         rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO4
>                                   | 1ULL << VIRTIO_NET_F_HOST_TSO6
>                                   | 1ULL << VIRTIO_NET_F_CSUM);

I saw the commit that introduced such change, but it tells no reason why
it was added.

    commit 362ca39639ae871806be5ae97d55e1cbb14afd92
    Author: mweglicx
    Date:   Thu Apr 14 17:40:06 2016 +0100

        Update relevant artifacts to add support for DPDK 16.04.

        Following changes are applied:
        - INSTALL.DPDK.md: CONFIG_RTE_BUILD_COMBINE_LIBS step has been
          removed because it is no longer present in DPDK configuration
          (combined library is created by default),
        - INSTALL.DPDK.md: VHost Cuse configuration is updated,
        - netdev-dpdk.c: Link speed definition is changed in DPDK and
          netdev_dpdk_get_features is updated accordingly,
        - netdev-dpdk.c: TSO and checksum offload has been disabled for
          vhostuser device.
        - .travis/linux-build.sh: DPDK version is updated and legacy
          flags have been removed in configuration.

        Signed-off-by: Michal Weglicki
        Signed-off-by: Panu Matilainen
        Acked-by: Daniele Di Proietto

    --yliu
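For readers unfamiliar with the API: rte_vhost_feature_disable() masks bits out of the feature set the vhost backend will offer during negotiation, which is why the OVS snippet above kills TSO and checksum offload for all vhost ports at once. A rough model of that masking (a sketch, not DPDK's implementation; bit positions follow the virtio spec):

```c
#include <assert.h>
#include <stdint.h>

/* Virtio-net feature bit positions, per the virtio specification. */
#define VIRTIO_NET_F_CSUM       0
#define VIRTIO_NET_F_HOST_TSO4  11
#define VIRTIO_NET_F_HOST_TSO6  12

/* Rough model of rte_vhost_feature_disable(): clear the disabled bits
 * from the feature set offered to the guest before negotiation. */
static uint64_t offered_features(uint64_t supported, uint64_t disabled)
{
    return supported & ~disabled;
}
```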
Re: [Qemu-devel] dpdk/vpp and cross-version migration for vhost
On Tue, Nov 22, 2016 at 04:53:05PM +0200, Michael S. Tsirkin wrote: > > > You keep assuming that you have the VM started first and > > > figure out things afterwards, but this does not work. > > > > > > Think about a cluster of machines. You want to start a VM in > > > a way that will ensure compatibility with all hosts > > > in a cluster. > > > > I see. I was more considering about the case when the dst > > host (including the qemu and dpdk combo) is given, and > > then determine whether it will be a successfull migration > > or not. > > > > And you are asking that we need to know which host could > > be a good candidate before starting the migration. In such > > case, we indeed need some inputs from both the qemu and > > vhost-user backend. > > > > For DPDK, I think it could be simple, just as you said, it > > could be either a tiny script, or even a macro defined in > > the source code file (we extend it every time we add a > > new feature) to let the libvirt to read it. Or something > > else. > > There's the issue of APIs that tweak features as Maxime > suggested. Yes, it's a good point. > Maybe the only thing to do is to deprecate it, Looks like so. > but I feel some way for application to pass info into > guest might be benefitial. The two APIs are just for tweaking feature bits DPDK supports before any device got connected. It's another way to disable some features (the another obvious way is to through QEMU command lines). IMO, it's bit handy only in a case like: we have bunch of VMs. Instead of disabling something though qemu one by one, we could disable it once in DPDK. But I doubt the useful of it. It's only used in DPDK's vhost example after all. Nor is it used in vhost pmd, neither is it used in OVS. > > > If you don't, guest visible interface will change > > > and you won't be able to migrate. > > > > > > It does not make sense to discuss feature bits specifically > > > since that is not the only part of interface. 
> > > For example, max ring size supported might change. > > > > I don't quite understand why we have to consider the max ring > > size here? Isn't it a virtio device attribute, that QEMU could > > provide such compatibility information? > > > > I mean, DPDK is supposed to support vary vring size, it's QEMU > > to give a specifc value. > > If backend supports s/g of any size up to 2^16, there's no issue. I don't know others, but I see no issues in DPDK. > ATM some backends might be assuming up to 1K s/g since > QEMU never supported bigger ones. We might classify this > as a bug, or not and add a feature flag. > > But it's just an example. There might be more values at issue > in the future. Yeah, maybe. But we could analysis it one by one. > > > Let me describe how it works in qemu/libvirt. > > > When you install a VM, you can specify compatibility > > > level (aka "machine type"), and you can query the supported compatibility > > > levels. Management uses that to find the supported compatibility > > > and stores the compatibility in XML that is migrated with the VM. > > > There's also a way to find the latest level which is the > > > default unless overridden by user, again this level > > > is recorded and then > > > - management can make sure migration destination is compatible > > > - management can avoid migration to hosts without that support > > > > Thanks for the info, it helps. > > > > ... > > > > > >>As version here is an opaque string for libvirt and qemu, > > > > > >>anything can be used - but I suggest either a list > > > > > >>of values defining the interface, e.g. > > > > > >>any_layout=on,max_ring=256 > > > > > >>or a version including the name and vendor of the backend, > > > > > >>e.g. "org.dpdk.v4.5.6". > > > > The version scheme may not be ideal here. 
Assume a QEMU is supposed > > to work with a specific DPDK version, however, user may disable some > > newer features through qemu command line, that it also could work with > > an elder DPDK version. Using the version scheme will not allow us doing > > such migration to an elder DPDK version. The MTU is a lively example > > here? (when MTU feature is provided by QEMU but is actually disabled > > by user, that it could also work with an elder DPDK without MTU support). > > > > --yliu > > OK, so does a list of values look better to you then? Yes, if there are no better way. And I think it may be better to not list all those features, literally. But instead, using the number should be better, say, features=0xdeadbeef. Listing the feature names means we have to come to an agreement in all components involved here (QEMU, libvirt, DPDK, VPP, and maybe more backends), that we have to use the exact same feature names. Though it may not be a big deal, it lacks some flexibility. A feature bits will not have this issue. --yliu > > > > > > > >> > > > > > >>Note that typically the list of supported versions can only be > > > > > >>extended, not shrunk. Also, if the host/guest interf
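The raw-number idea above (features=0xdeadbeef) reduces the management-side decision to a subset check: a destination host is a valid migration target iff it supports every feature the source guest already acked, regardless of how either side names those features. A hedged sketch of that check:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Migration feasibility on raw feature bits: the destination backend
 * is compatible iff it supports every feature the source guest acked.
 * Extra destination-only features are fine - they simply will not be
 * negotiated after the restore. */
static bool migration_compatible(uint64_t src_acked, uint64_t dst_supported)
{
    return (src_acked & ~dst_supported) == 0;
}
```

This also covers the MTU example from the thread: a QEMU with the MTU feature disabled by the user yields an acked set without that bit, so an older DPDK without MTU support still passes the check.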
Re: [Qemu-devel] dpdk/vpp and cross-version migration for vhost
On Thu, Nov 17, 2016 at 07:37:09PM +0200, Michael S. Tsirkin wrote: > On Thu, Nov 17, 2016 at 05:49:36PM +0800, Yuanhan Liu wrote: > > On Thu, Nov 17, 2016 at 09:47:09AM +0100, Maxime Coquelin wrote: > > > > > > > > > On 11/17/2016 09:29 AM, Yuanhan Liu wrote: > > > >As usaual, sorry for late response :/ > > > > > > > >On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote: > > > >>Hi! > > > >>So it looks like we face a problem with cross-version > > > >>migration when using vhost. It's not new but became more > > > >>acute with the advent of vhost user. > > > >> > > > >>For users to be able to migrate between different versions > > > >>of the hypervisor the interface exposed to guests > > > >>by hypervisor must stay unchanged. > > > >> > > > >>The problem is that a qemu device is connected > > > >>to a backend in another process, so the interface > > > >>exposed to guests depends on the capabilities of that > > > >>process. > > > >> > > > >>Specifically, for vhost user interface based on virtio, this includes > > > >>the "host features" bitmap that defines the interface, as well as more > > > >>host values such as the max ring size. Adding new features/changing > > > >>values to this interface is required to make progress, but on the other > > > >>hand we need ability to get the old host features to be compatible. > > > > > > > >It looks like to the same issue of vhost-user reconnect to me. For > > > >example, > > > > > > > >- start dpdk 16.07 & qemu 2.5 > > > >- kill dpdk > > > >- start dpdk 16.11 > > > > > > > >Though DPDK 16.11 has more features comparing to dpdk 16.07 (say, > > > >indirect), > > > >above should work. Because qemu saves the negotiated features before the > > > >disconnect and stores it back after the reconnection. 
> > > > > > > >commit a463215b087c41d7ca94e51aa347cde523831873 > > > >Author: Marc-André Lureau > > > >Date: Mon Jun 6 18:45:05 2016 +0200 > > > > > > > >vhost-net: save & restore vhost-user acked features > > > > > > > >The initial vhost-user connection sets the features to be > > > > negotiated > > > >with the driver. Renegotiation isn't possible without device > > > > reset. > > > > > > > >To handle reconnection of vhost-user backend, ensure the same > > > > set of > > > >features are provided, and reuse already acked features. > > > > > > > >Signed-off-by: Marc-André Lureau > > > > > > > > > > > >So we could do similar to vhost-user? I mean, save the acked features > > > >before migration and store it back after it. This should be able to > > > >keep the compatibility. If user downgrades DPDK version, it also could > > > >be easily detected, and then exit with an error to user: migration > > > >failed due to un-compatible vhost features. > > > > > > > >Just some rough thoughts. Makes tiny sense? > > > > > > My understanding is that the management tool has to know whether > > > versions are compatible before initiating the migration: > > > > Makes sense. How about getting and restoring the acked features through > > qemu command lines then, say, through the monitor interface? > > > > With that, it would be something like: > > > > - start vhost-user backend (DPDK, VPP, or whatever) & qemu in the src host > > > > - read the acked features (through monitor interface) > > > > - start vhost-user backend in the dst host > > > > - start qemu in the dst host with the just queried acked features > > > > QEMU then is expected to use this feature set for the later vhost-user > > feature negotitation. Exit if features compatibility is broken. > > > > Thoughts? > > > > --yliu > > > You keep assuming that you have the VM started first and > figure out things afterwards, but this does not work. > > Think about a cluster of machines. 
You want to start a VM in > a way that will ensure compatibility with all hosts > in a cluster. I see. I was more considering about the case when the dst ho
Re: [Qemu-devel] dpdk/vpp and cross-version migration for vhost
On Thu, Nov 17, 2016 at 09:47:09AM +0100, Maxime Coquelin wrote: > > > On 11/17/2016 09:29 AM, Yuanhan Liu wrote: > >As usaual, sorry for late response :/ > > > >On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote: > >>Hi! > >>So it looks like we face a problem with cross-version > >>migration when using vhost. It's not new but became more > >>acute with the advent of vhost user. > >> > >>For users to be able to migrate between different versions > >>of the hypervisor the interface exposed to guests > >>by hypervisor must stay unchanged. > >> > >>The problem is that a qemu device is connected > >>to a backend in another process, so the interface > >>exposed to guests depends on the capabilities of that > >>process. > >> > >>Specifically, for vhost user interface based on virtio, this includes > >>the "host features" bitmap that defines the interface, as well as more > >>host values such as the max ring size. Adding new features/changing > >>values to this interface is required to make progress, but on the other > >>hand we need ability to get the old host features to be compatible. > > > >It looks like to the same issue of vhost-user reconnect to me. For example, > > > >- start dpdk 16.07 & qemu 2.5 > >- kill dpdk > >- start dpdk 16.11 > > > >Though DPDK 16.11 has more features comparing to dpdk 16.07 (say, indirect), > >above should work. Because qemu saves the negotiated features before the > >disconnect and stores it back after the reconnection. > > > >commit a463215b087c41d7ca94e51aa347cde523831873 > >Author: Marc-André Lureau > >Date: Mon Jun 6 18:45:05 2016 +0200 > > > >vhost-net: save & restore vhost-user acked features > > > >The initial vhost-user connection sets the features to be negotiated > >with the driver. Renegotiation isn't possible without device reset. > > > >To handle reconnection of vhost-user backend, ensure the same set of > >features are provided, and reuse already acked features. 
> > > >Signed-off-by: Marc-André Lureau > > > > > >So we could do similar to vhost-user? I mean, save the acked features > >before migration and store it back after it. This should be able to > >keep the compatibility. If user downgrades DPDK version, it also could > >be easily detected, and then exit with an error to user: migration > >failed due to un-compatible vhost features. > > > >Just some rough thoughts. Makes tiny sense? > > My understanding is that the management tool has to know whether > versions are compatible before initiating the migration: Makes sense. How about getting and restoring the acked features through qemu command lines then, say, through the monitor interface? With that, it would be something like: - start vhost-user backend (DPDK, VPP, or whatever) & qemu in the src host - read the acked features (through monitor interface) - start vhost-user backend in the dst host - start qemu in the dst host with the just queried acked features QEMU then is expected to use this feature set for the later vhost-user feature negotitation. Exit if features compatibility is broken. Thoughts? --yliu > 1. The downtime could be unpredictable if a VM has to move from hosts > to hosts multiple times, which is problematic, especially for NFV. > 2. If migration is not possible, maybe the management tool would > prefer not to interrupt the VM on current host. > > I have little experience with migration though, so I could be mistaken. > > Thanks, > Maxime > > > > > --yliu > >> > >>To solve this problem within qemu, qemu has a versioning system based on > >>a machine type concept which fundamentally is a version string, by > >>specifying that string one can get hardware compatible with a previous > >>qemu version. QEMU also reports the latest version and list of versions > >>supported so libvirt records the version at VM creation and then is > >>careful to use this machine version whenever it migrates a VM. 
> >> > >>One might wonder how is this solved with a kernel vhost backend. The > >>answer is that it mostly isn't - instead an assumption is made, that > >>qemu versions are deployed together with the kernel - this is generally > >>true for downstreams. Thus whenever qemu gains a new feature, it is > >>already supported by the kernel as well. However, if one attempts > >>migration wit
Re: [Qemu-devel] dpdk/vpp and cross-version migration for vhost
As usual, sorry for the late response :/

On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
> Hi!
> So it looks like we face a problem with cross-version migration when
> using vhost. It's not new but became more acute with the advent of
> vhost user.
>
> For users to be able to migrate between different versions of the
> hypervisor, the interface exposed to guests by the hypervisor must
> stay unchanged.
>
> The problem is that a qemu device is connected to a backend in another
> process, so the interface exposed to guests depends on the
> capabilities of that process.
>
> Specifically, for a vhost user interface based on virtio, this
> includes the "host features" bitmap that defines the interface, as
> well as more host values such as the max ring size. Adding new
> features/changing values of this interface is required to make
> progress, but on the other hand we need the ability to get the old
> host features to be compatible.

It looks like the same issue as vhost-user reconnect to me. For example:

- start dpdk 16.07 & qemu 2.5
- kill dpdk
- start dpdk 16.11

Though DPDK 16.11 has more features compared to dpdk 16.07 (say,
indirect), the above should work, because qemu saves the negotiated
features before the disconnect and restores them after the reconnection.

    commit a463215b087c41d7ca94e51aa347cde523831873
    Author: Marc-André Lureau
    Date:   Mon Jun 6 18:45:05 2016 +0200

        vhost-net: save & restore vhost-user acked features

        The initial vhost-user connection sets the features to be
        negotiated with the driver. Renegotiation isn't possible
        without device reset.

        To handle reconnection of vhost-user backend, ensure the same
        set of features are provided, and reuse already acked features.

        Signed-off-by: Marc-André Lureau

So we could do something similar for migration? I mean, save the acked
features before migration and restore them after it. This should be
able to keep the compatibility. If the user downgrades the DPDK
version, it could also be easily detected, and we then exit with an
error to the user: migration failed due to incompatible vhost features.

Just some rough thoughts. Do they make any sense?

    --yliu

> To solve this problem within qemu, qemu has a versioning system based
> on a machine type concept which fundamentally is a version string; by
> specifying that string one can get hardware compatible with a previous
> qemu version. QEMU also reports the latest version and the list of
> versions supported, so libvirt records the version at VM creation and
> then is careful to use this machine version whenever it migrates a VM.
>
> One might wonder how this is solved with a kernel vhost backend. The
> answer is that it mostly isn't - instead an assumption is made that
> qemu versions are deployed together with the kernel - this is
> generally true for downstreams. Thus whenever qemu gains a new
> feature, it is already supported by the kernel as well. However, if
> one attempts migration with a new qemu from a system with a new
> kernel to one with an old kernel, one would get a failure.
>
> In a world where we have multiple userspace backends, with some of
> these supplied by ISVs, this seems unrealistic.
>
> IMO we need to support vhost backend versioning, ideally in a way
> that will also work for vhost kernel backends.
>
> So I'd like to get some input from both backend and management
> developers on what a good solution would look like.
>
> If we want to emulate the qemu solution, this involves adding the
> concept of interface versions to dpdk. For example, dpdk could supply
> a file (or a utility printing it?) with the list of versions: latest
> and versions supported. libvirt could read that and
> - store the latest version at vm creation
> - pass it around with the vm
> - pass it to qemu
>
> From here, qemu could pass this over the vhost-user channel, thus
> making sure it's initialized with the correct compatible interface.
>
> As the version here is an opaque string for libvirt and qemu,
> anything can be used - but I suggest either a list of values defining
> the interface, e.g.
>     any_layout=on,max_ring=256
> or a version including the name and vendor of the backend, e.g.
> "org.dpdk.v4.5.6".
>
> Note that typically the list of supported versions can only be
> extended, not shrunk. Also, if the host/guest interface does not
> change, don't change the current version as this just creates work
> for everyone.
>
> Thoughts? Would this work well for management? dpdk? vpp?
>
> Thanks!
>
> --
> MST
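MST's first suggested encoding above is an opaque key=value list such as any_layout=on,max_ring=256. As a sketch only (the format is defined by nothing more than that example), management code could pull an integer value out of such a string like this:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Find "key=" in a "k=v,k=v" compatibility string (format assumed from
 * the "any_layout=on,max_ring=256" example above) and parse its value
 * as an integer. Returns 0 on success, -1 if the key is absent. */
static int compat_get_int(const char *s, const char *key, long *out)
{
    size_t klen = strlen(key);
    const char *p = s;

    while (p) {
        /* Each iteration starts at the beginning of a k=v token. */
        if (strncmp(p, key, klen) == 0 && p[klen] == '=') {
            *out = strtol(p + klen + 1, NULL, 10);
            return 0;
        }
        p = strchr(p, ',');
        if (p)
            p++;                /* skip the comma */
    }
    return -1;
}
```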
Re: [Qemu-devel] [dpdk-dev] [PATCH 1/2] vhost: enable any layout feature
On Tue, Oct 11, 2016 at 02:57:49PM +0800, Yuanhan Liu wrote:
> > > > > > There was an example: the vhost enqueue optimization patchset
> > > > > > from Zhihong [0] uses memset, and it introduces more than a
> > > > > > 15% drop (IIRC)

Though it doesn't matter now, I have verified it yesterday (with and
without memset), and the drop could be up to 30+%. This is to let you
know that it could behave badly if memset is not inlined.

> > > > > > on my Ivybridge server: it has no such issue on his server
> > > > > > though.
> > > > > >
> > > > > > [0]: http://dpdk.org/ml/archives/dev/2016-August/045272.html
> > > > > >
> > > > > > --yliu
> > > > >
> > > > > I'd say that's weird. what's your config? any chance you
> > > > > are using an old compiler?
> > > >
> > > > Not really, it's gcc 5.3.1. Maybe Zhihong could explain more.
> > > > IIRC, he said the memset is not well optimized for Ivybridge
> > > > server.
> > >
> > > The dst is remote in that case. It's fine on Haswell but has
> > > complication in Ivy Bridge which (wasn't supposed to but) causes
> > > serious frontend issue.
> > >
> > > I don't think gcc inlined it there. I'm using fc24 gcc 6.1.1.
> >
> > So try something like this then:
>
> Yes, I saw memset is inlined when this diff is applied.

I have another concern though: it's a trick that could let gcc do the
inlining, and I am not quite sure that's true with other compilers
(i.e. clang, icc, or even older gcc).

For this case, I think I would still prefer some trick like

    *(struct ..*) = {0, };

Or even, we could introduce rte_memset(). IIRC, that has been proposed
somehow before?

    --yliu
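The struct-assignment trick referred to above (`*(struct ..*) = {0, }`) zeroes the header with plain stores and does not depend on the compiler choosing to inline an external memset() call. A minimal sketch, with the header layout taken from the virtio spec's virtio_net_hdr:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Layout per the virtio spec's struct virtio_net_hdr
 * (without the mergeable-rx num_buffers field). */
struct virtio_net_hdr {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
};

/* The trick from the thread: a compound-literal assignment compiles to
 * a few direct stores, with no call into libc and thus no dependence
 * on the compiler inlining memset(). */
static void clear_hdr(struct virtio_net_hdr *hdr)
{
    *hdr = (struct virtio_net_hdr){ 0 };
}
```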
Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
On Mon, Oct 10, 2016 at 07:39:59AM +0300, Michael S. Tsirkin wrote:
> > > > > > 1. why is this memset expensive?
> > > > >
> > > > > I don't have the exact answer, but just some rough thoughts:
> > > > >
> > > > > It's an external clib function: there is a call stack and the
> > > > > IP register will bounce back and forth.
> > > >
> > > > for memset 0? gcc 5.3.1 on fedora happily inlines it.
> > >
> > > Good to know!
> > >
> > > > > overkill to use that for resetting a 14 byte structure.
> > > > >
> > > > > Some trick like
> > > > >     *(struct virtio_net_hdr *)hdr = {0, };
> > > > >
> > > > > Or even
> > > > >     hdr->xxx = 0;
> > > > >     hdr->yyy = 0;
> > > > >
> > > > > should behave better.
> > > > >
> > > > > There was an example: the vhost enqueue optimization patchset
> > > > > from Zhihong [0] uses memset, and it introduces more than a
> > > > > 15% drop (IIRC) on my Ivybridge server: it has no such issue
> > > > > on his server though.
> > > > >
> > > > > [0]: http://dpdk.org/ml/archives/dev/2016-August/045272.html
> > > > >
> > > > > --yliu
> > > >
> > > > I'd say that's weird. what's your config? any chance you
> > > > are using an old compiler?
> > >
> > > Not really, it's gcc 5.3.1. Maybe Zhihong could explain more.
> > > IIRC, he said the memset is not well optimized for Ivybridge
> > > server.
> >
> > The dst is remote in that case. It's fine on Haswell but has
> > complication in Ivy Bridge which (wasn't supposed to but) causes
> > serious frontend issue.
> >
> > I don't think gcc inlined it there. I'm using fc24 gcc 6.1.1.
>
> So try something like this then:

Yes, I saw memset is inlined when this diff is applied. So, mind
sending a formal patch? You might want to try a build at least: it
doesn't build as is.

> Generally pointer chasing in vq->hw->vtnet_hdr_size can't be good
> for performance. Move fields used on data path into vq
> and use from there to avoid indirections?

Good suggestion!

    --yliu
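MST's closing suggestion, avoiding the vq->hw->vtnet_hdr_size pointer chase, means snapshotting hot, immutable values into the vq at setup time so the data path does one load instead of two dependent ones. A sketch with the structs reduced to the relevant fields (not the real DPDK definitions):

```c
#include <assert.h>
#include <stdint.h>

/* Reduced stand-ins for the virtio PMD structs under discussion,
 * keeping only the fields the suggestion touches. */
struct virtio_hw {
    uint32_t vtnet_hdr_size;
};

struct virtqueue {
    struct virtio_hw *hw;
    uint32_t hdr_size;      /* cached copy for the data path */
};

/* Setup path: snapshot the value once; it never changes after
 * negotiation, so the copy cannot go stale. */
static void vq_cache_hot_fields(struct virtqueue *vq)
{
    vq->hdr_size = vq->hw->vtnet_hdr_size;
}

/* Data path: one load from vq instead of the dependent
 * vq->hw->vtnet_hdr_size chain. */
static uint32_t vq_hdr_size(const struct virtqueue *vq)
{
    return vq->hdr_size;
}
```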
Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
On Tue, Oct 11, 2016 at 08:39:54AM +0200, Maxime Coquelin wrote: > > > On 10/11/2016 08:04 AM, Yuanhan Liu wrote: > >On Mon, Oct 10, 2016 at 04:54:39PM +0200, Maxime Coquelin wrote: > >> > >> > >>On 10/10/2016 04:42 PM, Yuanhan Liu wrote: > >>>On Mon, Oct 10, 2016 at 02:40:44PM +0200, Maxime Coquelin wrote: > >>>>>>>At that time, a packet always use 2 descs. Since indirect desc is > >>>>>>>enabled (by default) now, the assumption is not true then. What's > >>>>>>>worse, it might even slow things a bit down. That should also be > >>>>>>>part of the reason why performance is slightly worse than before. > >>>>>>> > >>>>>>> --yliu > >>>>>> > >>>>>>I'm not sure I get what you are saying > >>>>>> > >>>>>>>commit 1d41d77cf81c448c1b09e1e859bfd300e2054a98 > >>>>>>>Author: Yuanhan Liu > >>>>>>>Date: Mon May 2 17:46:17 2016 -0700 > >>>>>>> > >>>>>>> vhost: optimize dequeue for small packets > >>>>>>> > >>>>>>> A virtio driver normally uses at least 2 desc buffers for Tx: the > >>>>>>> first for storing the header, and the others for storing the data. > >>>>>>> > >>>>>>> Therefore, we could fetch the first data desc buf before the main > >>>>>>> loop, and do the copy first before the check of "are we done yet?". > >>>>>>> This could save one check for small packets that just have one data > >>>>>>> desc buffer and need one mbuf to store it. > >>>>>>> > >>>>>>> Signed-off-by: Yuanhan Liu > >>>>>>> Acked-by: Huawei Xie > >>>>>>> Tested-by: Rich Lane > >>>>>> > >>>>>>This fast-paths the 2-descriptors format but it's not active > >>>>>>for indirect descriptors. Is this what you mean? > >>>>> > >>>>>Yes. It's also not active when ANY_LAYOUT is actually turned on. > >>>>>>Should be a simple matter to apply this optimization for indirect. > >>>>> > >>>>>Might be. > >>>> > >>>>If I understand the code correctly, indirect descs also benefit from this > >>>>optimization, or am I missing something? > >>> > >>>Aha..., you are right! 
> >>
> >>The interesting thing is that the patch I sent on Thursday that
> >>removes header access when no offload has been negotiated[0] seems
> >>to reduce almost to zero the performance gain seen with indirect
> >>descriptors enabled.
> >
> >Didn't follow that.
> >
> >>I see this with 64 bytes packets using testpmd on both ends.
> >>
> >>When I did the patch, I would have expected the same gain with both
> >>modes, whereas I measured +1% for direct and +4% for indirect.
> >
> >IIRC, I did a test before (remove those offload code piece), and the
> >performance was basically the same before and after that. Well, there
> >might be some small difference, say 1% as you said. But the result
> >has never been steady.
> >
> >Anyway, I think your patch is good to have: I just didn't see a v2.
>
> I waited to gather some comments/feedback before sending the v2.
> I'll send it today or tomorrow.

Interesting, I see a deadlock then: I haven't looked at the code
carefully since you said there is a v2, thus I'm waiting for it.
However, you are waiting for my review. :)

Anyway, I will take time to look at it shortly.

    --yliu
Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
On Mon, Oct 10, 2016 at 04:54:39PM +0200, Maxime Coquelin wrote: > > > On 10/10/2016 04:42 PM, Yuanhan Liu wrote: > >On Mon, Oct 10, 2016 at 02:40:44PM +0200, Maxime Coquelin wrote: > >>>>>At that time, a packet always use 2 descs. Since indirect desc is > >>>>>enabled (by default) now, the assumption is not true then. What's > >>>>>worse, it might even slow things a bit down. That should also be > >>>>>part of the reason why performance is slightly worse than before. > >>>>> > >>>>> --yliu > >>>> > >>>>I'm not sure I get what you are saying > >>>> > >>>>>commit 1d41d77cf81c448c1b09e1e859bfd300e2054a98 > >>>>>Author: Yuanhan Liu > >>>>>Date: Mon May 2 17:46:17 2016 -0700 > >>>>> > >>>>> vhost: optimize dequeue for small packets > >>>>> > >>>>> A virtio driver normally uses at least 2 desc buffers for Tx: the > >>>>> first for storing the header, and the others for storing the data. > >>>>> > >>>>> Therefore, we could fetch the first data desc buf before the main > >>>>> loop, and do the copy first before the check of "are we done yet?". > >>>>> This could save one check for small packets that just have one data > >>>>> desc buffer and need one mbuf to store it. > >>>>> > >>>>> Signed-off-by: Yuanhan Liu > >>>>> Acked-by: Huawei Xie > >>>>> Tested-by: Rich Lane > >>>> > >>>>This fast-paths the 2-descriptors format but it's not active > >>>>for indirect descriptors. Is this what you mean? > >>> > >>>Yes. It's also not active when ANY_LAYOUT is actually turned on. > >>>>Should be a simple matter to apply this optimization for indirect. > >>> > >>>Might be. > >> > >>If I understand the code correctly, indirect descs also benefit from this > >>optimization, or am I missing something? > > > >Aha..., you are right! > > The interesting thing is that the patch I send on Thursday that removes > header access when no offload has been negotiated[0] seems to reduce > almost to zero the performance seen with indirect descriptors enabled. Didn't follow that. 
> I see this with 64-byte packets using testpmd on both ends. > > When I did the patch, I would have expected the same gain with both > modes, whereas I measured +1% for direct and +4% for indirect. IIRC, I did a test before (removing that offload code piece), and the performance was basically the same before and after. Well, there might be some small difference, say 1% as you said, but the result has never been steady. Anyway, I think your patch is good to have: I just haven't seen a v2. --yliu
Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
On Mon, Oct 10, 2016 at 02:40:44PM +0200, Maxime Coquelin wrote: > >>>At that time, a packet always use 2 descs. Since indirect desc is > >>>enabled (by default) now, the assumption is not true then. What's > >>>worse, it might even slow things a bit down. That should also be > >>>part of the reason why performance is slightly worse than before. > >>> > >>> --yliu > >> > >>I'm not sure I get what you are saying > >> > >>>commit 1d41d77cf81c448c1b09e1e859bfd300e2054a98 > >>>Author: Yuanhan Liu > >>>Date: Mon May 2 17:46:17 2016 -0700 > >>> > >>>vhost: optimize dequeue for small packets > >>> > >>>A virtio driver normally uses at least 2 desc buffers for Tx: the > >>>first for storing the header, and the others for storing the data. > >>> > >>>Therefore, we could fetch the first data desc buf before the main > >>> loop, and do the copy first before the check of "are we done yet?". > >>>This could save one check for small packets that just have one data > >>>desc buffer and need one mbuf to store it. > >>> > >>>Signed-off-by: Yuanhan Liu > >>>Acked-by: Huawei Xie > >>>Tested-by: Rich Lane > >> > >>This fast-paths the 2-descriptors format but it's not active > >>for indirect descriptors. Is this what you mean? > > > >Yes. It's also not active when ANY_LAYOUT is actually turned on. > >>Should be a simple matter to apply this optimization for indirect. > > > >Might be. > > If I understand the code correctly, indirect descs also benefit from this > optimization, or am I missing something? Aha..., you are right! --yliu
Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
On Mon, Oct 10, 2016 at 07:17:06AM +0300, Michael S. Tsirkin wrote: > On Mon, Oct 10, 2016 at 12:05:31PM +0800, Yuanhan Liu wrote: > > On Fri, Sep 30, 2016 at 10:16:43PM +0300, Michael S. Tsirkin wrote: > > > > > And the same is done is done in DPDK: > > > > > > > > > > static inline int __attribute__((always_inline)) > > > > > copy_desc_to_mbuf(struct virtio_net *dev, struct vring_desc *descs, > > > > > uint16_t max_desc, struct rte_mbuf *m, uint16_t desc_idx, > > > > > struct rte_mempool *mbuf_pool) > > > > > { > > > > > ... > > > > > /* > > > > > * A virtio driver normally uses at least 2 desc buffers > > > > > * for Tx: the first for storing the header, and others > > > > > * for storing the data. > > > > > */ > > > > > if (likely((desc->len == dev->vhost_hlen) && > > > > >(desc->flags & VRING_DESC_F_NEXT) != 0)) { > > > > > desc = &descs[desc->next]; > > > > > if (unlikely(desc->flags & VRING_DESC_F_INDIRECT)) > > > > > return -1; > > > > > > > > > > desc_addr = gpa_to_vva(dev, desc->addr); > > > > > if (unlikely(!desc_addr)) > > > > > return -1; > > > > > > > > > > rte_prefetch0((void *)(uintptr_t)desc_addr); > > > > > > > > > > desc_offset = 0; > > > > > desc_avail = desc->len; > > > > > nr_desc+= 1; > > > > > > > > > > PRINT_PACKET(dev, (uintptr_t)desc_addr, desc->len, 0); > > > > > } else { > > > > > desc_avail = desc->len - dev->vhost_hlen; > > > > > desc_offset = dev->vhost_hlen; > > > > > } > > > > > > > > Actually, the header is parsed in DPDK vhost implementation. > > > > But as Virtio PMD provides a zero'ed header, we could just parse > > > > the header only if VIRTIO_NET_F_NO_TX_HEADER is not negotiated. > > > > > > host can always skip the header parse if it wants to. > > > It didn't seem worth it to add branches there but > > > if I'm wrong, by all means code it up. > > > > It's added by following commit, which yields about 10% performance > > boosts for PVP case (with 64B packet size). > > > > At that time, a packet always use 2 descs. 
Since indirect desc is > > enabled (by default) now, the assumption is not true then. What's > > worse, it might even slow things a bit down. That should also be > > part of the reason why performance is slightly worse than before. > > > > --yliu > > I'm not sure I get what you are saying > > > commit 1d41d77cf81c448c1b09e1e859bfd300e2054a98 > > Author: Yuanhan Liu > > Date: Mon May 2 17:46:17 2016 -0700 > > > > vhost: optimize dequeue for small packets > > > > A virtio driver normally uses at least 2 desc buffers for Tx: the > > first for storing the header, and the others for storing the data. > > > > Therefore, we could fetch the first data desc buf before the main > > loop, and do the copy first before the check of "are we done yet?". > > This could save one check for small packets that just have one data > > desc buffer and need one mbuf to store it. > > > > Signed-off-by: Yuanhan Liu > > Acked-by: Huawei Xie > > Tested-by: Rich Lane > > This fast-paths the 2-descriptors format but it's not active > for indirect descriptors. Is this what you mean? Yes. It's also not active when ANY_LAYOUT is actually turned on. > Should be a simple matter to apply this optimization for indirect. Might be. --yliu
Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
On Fri, Sep 30, 2016 at 10:16:43PM +0300, Michael S. Tsirkin wrote: > > > And the same is done is done in DPDK: > > > > > > static inline int __attribute__((always_inline)) > > > copy_desc_to_mbuf(struct virtio_net *dev, struct vring_desc *descs, > > > uint16_t max_desc, struct rte_mbuf *m, uint16_t desc_idx, > > > struct rte_mempool *mbuf_pool) > > > { > > > ... > > > /* > > > * A virtio driver normally uses at least 2 desc buffers > > > * for Tx: the first for storing the header, and others > > > * for storing the data. > > > */ > > > if (likely((desc->len == dev->vhost_hlen) && > > >(desc->flags & VRING_DESC_F_NEXT) != 0)) { > > > desc = &descs[desc->next]; > > > if (unlikely(desc->flags & VRING_DESC_F_INDIRECT)) > > > return -1; > > > > > > desc_addr = gpa_to_vva(dev, desc->addr); > > > if (unlikely(!desc_addr)) > > > return -1; > > > > > > rte_prefetch0((void *)(uintptr_t)desc_addr); > > > > > > desc_offset = 0; > > > desc_avail = desc->len; > > > nr_desc+= 1; > > > > > > PRINT_PACKET(dev, (uintptr_t)desc_addr, desc->len, 0); > > > } else { > > > desc_avail = desc->len - dev->vhost_hlen; > > > desc_offset = dev->vhost_hlen; > > > } > > > > Actually, the header is parsed in DPDK vhost implementation. > > But as Virtio PMD provides a zero'ed header, we could just parse > > the header only if VIRTIO_NET_F_NO_TX_HEADER is not negotiated. > > host can always skip the header parse if it wants to. > It didn't seem worth it to add branches there but > if I'm wrong, by all means code it up. It's added by following commit, which yields about 10% performance boosts for PVP case (with 64B packet size). At that time, a packet always use 2 descs. Since indirect desc is enabled (by default) now, the assumption is not true then. What's worse, it might even slow things a bit down. That should also be part of the reason why performance is slightly worse than before. 
--yliu commit 1d41d77cf81c448c1b09e1e859bfd300e2054a98 Author: Yuanhan Liu Date: Mon May 2 17:46:17 2016 -0700 vhost: optimize dequeue for small packets A virtio driver normally uses at least 2 desc buffers for Tx: the first for storing the header, and the others for storing the data. Therefore, we could fetch the first data desc buf before the main loop, and do the copy first before the check of "are we done yet?". This could save one check for small packets that just have one data desc buffer and need one mbuf to store it. Signed-off-by: Yuanhan Liu Acked-by: Huawei Xie Tested-by: Rich Lane
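The fast path from commit 1d41d77 discussed above can be reduced to a small, self-contained model (not the actual DPDK code; names like `first_data_desc` are illustrative): when the first descriptor holds exactly the vhost header and chains to a data descriptor, the copy can start from the data descriptor before entering the main loop.

```c
#include <assert.h>
#include <stdint.h>

#define VRING_DESC_F_NEXT 1u

/* Simplified descriptor: just length, flags and next index. */
struct desc {
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

/*
 * Model of the small-packet fast path: if the first desc holds exactly
 * the vhost header and chains onward, start copying from the data desc
 * before the main loop, saving one "are we done yet?" check for packets
 * with a single data desc.  Returns the desc index the copy starts from.
 */
static uint16_t first_data_desc(const struct desc *descs, uint16_t idx,
                                uint32_t vhost_hlen, int *fast_path)
{
    const struct desc *d = &descs[idx];

    if (d->len == vhost_hlen && (d->flags & VRING_DESC_F_NEXT)) {
        *fast_path = 1;          /* header-only first desc: skip ahead */
        return d->next;
    }
    *fast_path = 0;              /* header and data share the first desc */
    return idx;
}
```

This also shows why ANY_LAYOUT hurts the optimization: a guest that merges header and data into one descriptor takes the `else` branch, and the saved check is lost, matching the observation in the thread.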
Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
On Thu, Sep 29, 2016 at 10:05:22PM +0200, Maxime Coquelin wrote: > >so doing this unconditionally would be a spec violation, but if you see > >value in this, we can add a feature bit. > Right, it would be a spec violation, so it should be done conditionally. > If a feature bit is to be added, what about VIRTIO_NET_F_NO_TX_HEADER? > It would imply VIRTIO_NET_F_CSUM not set, and no GSO features set. > If negotiated, we wouldn't need to prepend a header. If we can skip the Tx header, I think we could also skip the Rx header, in the case when mrg-rx is also turned off? > From the micro-benchmark results, we can expect +10% compared to > indirect descriptors, and +5% compared to using 2 descs in the > virtqueue. > Also, it should have the same benefits as indirect descriptors for 0% > pkt loss (as we can fill 2x more packets in the virtqueue). > > What do you think? I would vote for this. It should yield maximum performance for the case where it's guaranteed that the packet size will always fit in a typical MTU (1500). --yliu
Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
On Mon, Oct 10, 2016 at 06:46:44AM +0300, Michael S. Tsirkin wrote: > On Mon, Oct 10, 2016 at 11:37:44AM +0800, Yuanhan Liu wrote: > > On Thu, Sep 29, 2016 at 11:21:48PM +0300, Michael S. Tsirkin wrote: > > > On Thu, Sep 29, 2016 at 10:05:22PM +0200, Maxime Coquelin wrote: > > > > > > > > > > > > On 09/29/2016 07:57 PM, Michael S. Tsirkin wrote: > > > Yes but two points. > > > > > > 1. why is this memset expensive? > > > > I don't have the exact answer, but just some rough thoughts: > > > > It's an external clib function: there is a call stack and the > > IP register will bounce back and forth. > > for memset 0? gcc 5.3.1 on fedora happily inlines it. Good to know! > > overkill to use that for resetting a 14-byte structure. > > > > Some trick like > > *(struct virtio_net_hdr *)hdr = (struct virtio_net_hdr){ 0 }; > > > > Or even > > hdr->xxx = 0; > > hdr->yyy = 0; > > > > should behave better. > > > > There was an example: the vhost enqueue optimization patchset from > > Zhihong [0] uses memset, and it introduces a more than 15% drop (IIRC) > > on my Ivybridge server: it has no such issue on his server though. > > > > [0]: http://dpdk.org/ml/archives/dev/2016-August/045272.html > > > > --yliu > > I'd say that's weird. What's your config? Any chance you > are using an old compiler? Not really, it's gcc 5.3.1. Maybe Zhihong could explain more. IIRC, he said the memset is not well optimized for the Ivybridge server. --yliu
Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
On Mon, Oct 10, 2016 at 06:04:32AM +0300, Michael S. Tsirkin wrote: > > > So I guess at this point, we can teach vhost-user in qemu > > > that version 1 implies any_layout but only for machine types > > > qemu 2.8 and up. It sets a bad precedent but oh well. > > > > It should work. > > > > --yliu > > Cool. Want to post a patch? I could have a try this week (Well, it's very unlikely though). If not, it will be postponed for a while: I am traveling next week. --yliu
Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
On Mon, Oct 10, 2016 at 02:20:22AM +0300, Michael S. Tsirkin wrote: > On Wed, Sep 28, 2016 at 10:28:48AM +0800, Yuanhan Liu wrote: > > On Tue, Sep 27, 2016 at 10:56:40PM +0300, Michael S. Tsirkin wrote: > > > On Tue, Sep 27, 2016 at 11:11:58AM +0800, Yuanhan Liu wrote: > > > > On Mon, Sep 26, 2016 at 10:24:55PM +0300, Michael S. Tsirkin wrote: > > > > > On Mon, Sep 26, 2016 at 11:01:58AM -0700, Stephen Hemminger wrote: > > > > > > I assume that if using Version 1 that the bit will be ignored > > > > > > > > Yes, but I will just quote what you just said: what if the guest > > > > virtio device is a legacy device? I also gave my reasons in another > > > > email why I consistently set this flag: > > > > > > > > - we have to return all features we support to the guest. > > > > > > > > We don't know the guest is a modern or legacy device. That means > > > > we should claim we support both: VERSION_1 and ANY_LAYOUT. > > > > > > > > Assume guest is a legacy device and we just set VERSION_1 (the > > > > current > > > > case), ANY_LAYOUT will never be negotiated. > > > > > > > > - I'm following the way Linux kernel takes: it also set both features. > > > > > > > > Maybe, we could unset ANY_LAYOUT when VERSION_1 is _negotiated_? > > > > > > > > The unset after negotiation I proposed turned out it won't work: the > > > > feature is already negotiated; unsetting it only in vhost side doesn't > > > > change anything. Besides, it may break the migration as Michael stated > > > > below. > > > > > > I think the reverse. Teach vhost user that for future machine types > > > only VERSION_1 implies ANY_LAYOUT. > > So I guess at this point, we can teach vhost-user in qemu > that version 1 implies any_layout but only for machine types > qemu 2.8 and up. It sets a bad precedent but oh well. It should work. --yliu
Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
On Thu, Sep 29, 2016 at 11:21:48PM +0300, Michael S. Tsirkin wrote: > On Thu, Sep 29, 2016 at 10:05:22PM +0200, Maxime Coquelin wrote: > > > > > > On 09/29/2016 07:57 PM, Michael S. Tsirkin wrote: > Yes but two points. > > 1. why is this memset expensive? I don't have the exact answer, but just some rough thoughts: It's an external clib function: there is a call stack and the IP register will bounce back and forth. BTW, it's kind of overkill to use that for resetting a 14-byte structure. Some trick like *(struct virtio_net_hdr *)hdr = (struct virtio_net_hdr){ 0 }; Or even hdr->xxx = 0; hdr->yyy = 0; should behave better. There was an example: the vhost enqueue optimization patchset from Zhihong [0] uses memset, and it introduces a more than 15% drop (IIRC) on my Ivybridge server: it has no such issue on his server though. [0]: http://dpdk.org/ml/archives/dev/2016-August/045272.html --yliu > Is the test completely skipping looking > at the packet otherwise? > > 2. As long as we are doing this, see > Alignment vs. Networking > > in Documentation/unaligned-memory-access.txt > > > > From the micro-benchmark results, we can expect +10% compared to > > indirect descriptors, and +5% compared to using 2 descs in the > > virtqueue. > > Also, it should have the same benefits as indirect descriptors for 0% > > pkt loss (as we can fill 2x more packets in the virtqueue). > > > > What do you think? > > > > Thanks, > > Maxime
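The two alternatives discussed above can be compared directly. A hedged sketch follows: the field layout matches the virtio spec's `struct virtio_net_hdr` (the thread's "14 bytes" suggests a variant with extra fields, e.g. `num_buffers`, but the comparison is the same either way).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Layout as in the virtio spec's struct virtio_net_hdr. */
struct virtio_net_hdr {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
};

/* Option 1: library call -- may or may not be inlined by the compiler. */
static void clear_hdr_memset(struct virtio_net_hdr *hdr)
{
    memset(hdr, 0, sizeof(*hdr));
}

/* Option 2: structure assignment -- typically compiles to a few stores. */
static void clear_hdr_assign(struct virtio_net_hdr *hdr)
{
    *hdr = (struct virtio_net_hdr){ 0 };
}
```

As the reply notes, a recent gcc happily inlines a fixed-size `memset` of this size, so on modern compilers both forms should generate comparable code; the 15% drop observed on the Ivy Bridge server was specific to that setup.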
Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
On Tue, Sep 27, 2016 at 10:56:40PM +0300, Michael S. Tsirkin wrote: > On Tue, Sep 27, 2016 at 11:11:58AM +0800, Yuanhan Liu wrote: > > On Mon, Sep 26, 2016 at 10:24:55PM +0300, Michael S. Tsirkin wrote: > > > On Mon, Sep 26, 2016 at 11:01:58AM -0700, Stephen Hemminger wrote: > > > > I assume that if using Version 1 that the bit will be ignored > > > > Yes, but I will just quote what you just said: what if the guest > > virtio device is a legacy device? I also gave my reasons in another > > email why I consistently set this flag: > > > > - we have to return all features we support to the guest. > > > > We don't know the guest is a modern or legacy device. That means > > we should claim we support both: VERSION_1 and ANY_LAYOUT. > > > > Assume guest is a legacy device and we just set VERSION_1 (the current > > case), ANY_LAYOUT will never be negotiated. > > > > - I'm following the way Linux kernel takes: it also set both features. > > > > Maybe, we could unset ANY_LAYOUT when VERSION_1 is _negotiated_? > > > > The unset after negotiation I proposed turned out it won't work: the > > feature is already negotiated; unsetting it only in vhost side doesn't > > change anything. Besides, it may break the migration as Michael stated > > below. > > I think the reverse. Teach vhost user that for future machine types > only VERSION_1 implies ANY_LAYOUT. > > > > > Therein lies a problem. If dpdk tweaks flags, updating it > > > will break guest migration. > > > > > > One way is to require that users specify all flags fully when > > > creating the virtio net device. > > > > Like how? By a new command line option? And user has to type > > all those features? > > Make libvirt do this. users use management normally. those that don't > likely don't migrate VMs. Fair enough. > > > > QEMU could verify that all required > > > flags are set, and fail init if not. > > > > > > This has other advantages, e.g. it adds ability to > > > init device without waiting for dpdk to connect. 
Will the feature negotiation between DPDK and QEMU still exist in your proposal? > > > > > > However, enabling each new feature would now require > > > management work. How about dpdk ships the list > > > of supported features instead? > > > Management tools could read them on source and destination > > > and select features supported on both sides. > > > > That means the management tool would somehow have a dependency on > > the DPDK project, which I have no objection to at all. But, is that > > a good idea? > > It already starts the bridge somehow, does it not? Indeed. I was initially thinking about reading the DPDK source files to determine the supported feature list, but that binding is too tight. I later realized you may ask DPDK to provide a binary to dump the list, or something like that. > > > BTW, I'm not quite sure I followed your idea. I mean, how is it supposed > > to fix the ANY_LAYOUT issue here? How will this flag be set for a > > legacy device? > > > > --yliu > > For ANY_LAYOUT, I think we should just set it in qemu, > but only for new machine types. What do you mean by "new machine types"? A virtio device with a newer virtio-spec version? > This addresses migration > concerns. To make sure I followed you, do you mean the migration issue from an older "dpdk + qemu" combo to a newer "dpdk + qemu" combo (where more new features might be shipped)? Besides that, your proposal looks like a lot of work to accomplish. Are you okay with making it simple first: set it consistently like the Linux kernel does? This would at least make ANY_LAYOUT actually be enabled for legacy devices (which are also the default ones that are widely used so far). --yliu > > But there will be more new features in the future and > it is necessary to think about how we will enable them without > breaking migration. > > -- > MST
Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
On Mon, Sep 26, 2016 at 10:24:55PM +0300, Michael S. Tsirkin wrote: > On Mon, Sep 26, 2016 at 11:01:58AM -0700, Stephen Hemminger wrote: > > I assume that if using Version 1 that the bit will be ignored Yes, but I will just quote what you just said: what if the guest virtio device is a legacy device? I also gave my reasons in another email why I consistently set this flag: - we have to return all features we support to the guest. We don't know whether the guest is a modern or legacy device. That means we should claim we support both: VERSION_1 and ANY_LAYOUT. Assume the guest is a legacy device and we just set VERSION_1 (the current case): ANY_LAYOUT will never be negotiated. - I'm following the way the Linux kernel takes: it also sets both features. Maybe, we could unset ANY_LAYOUT when VERSION_1 is _negotiated_? The unset-after-negotiation I proposed turned out not to work: the feature is already negotiated; unsetting it only on the vhost side doesn't change anything. Besides, it may break migration, as Michael stated below. > Therein lies a problem. If dpdk tweaks flags, updating it > will break guest migration. > > One way is to require that users specify all flags fully when > creating the virtio net device. Like how? By a new command line option? And the user has to type all those features? > QEMU could verify that all required > flags are set, and fail init if not. > > This has other advantages, e.g. it adds ability to > init device without waiting for dpdk to connect. > > However, enabling each new feature would now require > management work. How about dpdk ships the list > of supported features instead? > Management tools could read them on source and destination > and select features supported on both sides. That means the management tool would somehow have a dependency on the DPDK project, which I have no objection to at all. But, is that a good idea? BTW, I'm not quite sure I followed your idea. I mean, how is it supposed to fix the ANY_LAYOUT issue here?
How will this flag be set for a legacy device? --yliu
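The negotiation problem in the message above can be illustrated with the actual feature bit numbers (VIRTIO_F_ANY_LAYOUT is bit 27, VIRTIO_F_VERSION_1 is bit 32). A legacy (pre-1.0) guest only sees and acks the low 32 feature bits, so unless the host advertises ANY_LAYOUT explicitly, a legacy guest can never negotiate it, no matter what VERSION_1 "implies":

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_F_ANY_LAYOUT  (1ULL << 27)
#define VIRTIO_F_VERSION_1   (1ULL << 32)

/* Negotiated features are simply the intersection of both offers. */
static uint64_t negotiate(uint64_t host, uint64_t guest)
{
    return host & guest;
}

/* A legacy (pre-1.0) guest only sees/acks the low 32 feature bits. */
static uint64_t legacy_guest_ack(uint64_t host_features)
{
    return host_features & 0xffffffffULL;
}
```

This is why the thread argues for setting both bits consistently, as the Linux kernel does: with only VERSION_1 advertised, a legacy guest ends up with neither bit.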
Re: [Qemu-devel] [PATCH v2 22/23] vhost-user: wait until backend init is completed
On Thu, Jun 30, 2016 at 05:22:43AM -0400, Marc-André Lureau wrote: > Hi > > - Original Message - > > On Fri, Jun 24, 2016 at 03:51:09PM +0200, marcandre.lur...@redhat.com wrote: > > > From: Marc-André Lureau > > > > > > The chardev waits for an initial connection before starting qemu, > > > vhost-user wants the backend negotiation to be completed. vhost-user is > > > started in the net_vhost_user_event callback, which is synchronously > > > called after the socket is connected. > > > > > > Signed-off-by: Marc-André Lureau > > > --- > > > net/vhost-user.c | 12 +++- > > > 1 file changed, 11 insertions(+), 1 deletion(-) > > > > > > diff --git a/net/vhost-user.c b/net/vhost-user.c > > > index 95ed2d2..4badd9e 100644 > > > --- a/net/vhost-user.c > > > +++ b/net/vhost-user.c > > > @@ -24,6 +24,7 @@ typedef struct VhostUserState { > > > VHostNetState *vhost_net; > > > int watch; > > > uint64_t acked_features; > > > +bool started; > > > } VhostUserState; > > > > > > typedef struct VhostUserChardevProps { > > > @@ -211,6 +212,7 @@ static void net_vhost_user_event(void *opaque, int > > > event) > > > return; > > > } > > > qmp_set_link(name, true, &err); > > > +s->started = true; > > > break; > > > case CHR_EVENT_CLOSED: > > > qmp_set_link(name, false, &err); > > > @@ -248,7 +250,15 @@ static int net_vhost_user_init(NetClientState *peer, > > > const char *device, > > > s->chr = chr; > > > } > > > > > > -qemu_chr_add_handlers(chr, NULL, NULL, net_vhost_user_event, > > > nc[0].name); > > > +do { > > > +Error *err = NULL; > > > +if (qemu_chr_wait_connected(chr, &err) < 0) { > > > +error_report_err(err); > > > +return -1; > > > +} > > > +qemu_chr_add_handlers(chr, NULL, NULL, > > > + net_vhost_user_event, nc[0].name); > > > +} while (!s->started); > > > > > > I haven't looked at your patchset carefully yet, but I did a quick test, > > and showed that above code piece just breaks vhost-user: it's a dead loop > > that vhost-user net will be initiated again and again. 
> > Interesting, thanks a lot for testing! > > The vhost-user-test works just fine, as well as vhost-user-bridge. > > > The dead loop is due to the fact that we check "s->started" corresponding to the last nc, > > while we set "s->started" corresponding to the first nc. > > I see, that makes sense with multiqueue, I'll update the code. It would be > really nice to have a basic multiqueue test in vhost-user-test. My bad. It was actually my task to add the multiqueue test: it was postponed, or even forgotten :(. If you don't mind, feel free to add it. If you don't have time for that, I will spend some time to finish what I left. --yliu
Re: [Qemu-devel] [PATCH v2 22/23] vhost-user: wait until backend init is completed
On Fri, Jun 24, 2016 at 03:51:09PM +0200, marcandre.lur...@redhat.com wrote: > From: Marc-André Lureau > > The chardev waits for an initial connection before starting qemu, > vhost-user wants the backend negotiation to be completed. vhost-user is > started in the net_vhost_user_event callback, which is synchronously > called after the socket is connected. > > Signed-off-by: Marc-André Lureau > --- > net/vhost-user.c | 12 +++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/net/vhost-user.c b/net/vhost-user.c > index 95ed2d2..4badd9e 100644 > --- a/net/vhost-user.c > +++ b/net/vhost-user.c > @@ -24,6 +24,7 @@ typedef struct VhostUserState { > VHostNetState *vhost_net; > int watch; > uint64_t acked_features; > +bool started; > } VhostUserState; > > typedef struct VhostUserChardevProps { > @@ -211,6 +212,7 @@ static void net_vhost_user_event(void *opaque, int event) > return; > } > qmp_set_link(name, true, &err); > +s->started = true; > break; > case CHR_EVENT_CLOSED: > qmp_set_link(name, false, &err); > @@ -248,7 +250,15 @@ static int net_vhost_user_init(NetClientState *peer, > const char *device, > s->chr = chr; > } > > -qemu_chr_add_handlers(chr, NULL, NULL, net_vhost_user_event, nc[0].name); > +do { > +Error *err = NULL; > +if (qemu_chr_wait_connected(chr, &err) < 0) { > +error_report_err(err); > +return -1; > +} > +qemu_chr_add_handlers(chr, NULL, NULL, > + net_vhost_user_event, nc[0].name); > +} while (!s->started); I haven't looked at your patchset carefully yet, but I did a quick test, and showed that above code piece just breaks vhost-user: it's a dead loop that vhost-user net will be initiated again and again. The dead loop is due to, we check "s->started" corresponding to last nc, while we set "s->started" corresponding to the first nc. --yliu
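The dead loop reported above can be reduced to a minimal model: with multiqueue there are N `VhostUserState` structures, the event callback (run once per connection) marks only the state it was registered with (the first queue's), but the wait loop tests the state it happens to hold, the last queue's, so its predicate never becomes true. A sketch with hypothetical names, not the actual QEMU code:

```c
#include <assert.h>
#include <stdbool.h>

#define NQUEUES 2

struct state { bool started; };

/* Models net_vhost_user_event(): only the first queue's state is marked. */
static void on_connect(struct state *s)
{
    s[0].started = true;
}

/*
 * Buggy predicate: tests the last queue's state, which is never set,
 * hence the init loop spins forever re-initiating vhost-user.
 */
static bool init_done_buggy(const struct state *s)
{
    return s[NQUEUES - 1].started;
}

/* Fixed predicate: test the same state the event callback actually sets. */
static bool init_done_fixed(const struct state *s)
{
    return s[0].started;
}
```

The fix Marc-André agreed to make is exactly to align the two sides: the flag that is set and the flag that is checked must belong to the same state object.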
Re: [Qemu-devel] [PATCH 07/10] vhost-net: save & restore vhost-user acked features
On Tue, May 10, 2016 at 06:03:57PM +0200, marcandre.lur...@redhat.com wrote: > From: Marc-André Lureau > > The initial vhost-user connection sets the features to be negotiated > with the driver. Renegotiation isn't possible without device reset. > > To handle reconnection of vhost-user backend, ensure the same set of > features are provided, and reuse already acked features. > > Signed-off-by: Marc-André Lureau > --- > hw/net/vhost_net.c | 21 - > include/net/vhost-user.h | 1 + > include/net/vhost_net.h | 3 +++ > net/vhost-user.c | 10 ++ > 4 files changed, 34 insertions(+), 1 deletion(-) > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c > index 805a0df..f3df18c 100644 > --- a/hw/net/vhost_net.c > +++ b/hw/net/vhost_net.c > @@ -120,6 +120,11 @@ uint64_t vhost_net_get_max_queues(VHostNetState *net) > return net->dev.max_queues; > } > > +uint64_t vhost_net_get_acked_features(VHostNetState *net) > +{ > +return net->dev.acked_features; > +} Note that you need to add a dummy implementation for !VHOST_NET. Otherwise, the build won't work. And sorry for being noisy, but here is another ping about the status of this patch set. --yliu
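The missing piece Yuanhan points out is a stub for builds without vhost-net support: QEMU keeps such stubs alongside the real implementations so callers link either way. A hedged sketch of the pattern (the config macro and the simplified `VHostNetState` below are illustrative, not the exact QEMU file layout):

```c
#include <assert.h>
#include <stdint.h>

/* Toggle at build time; 0 models a target built without vhost-net. */
#define CONFIG_VHOST_NET 0

typedef struct VHostNetState { uint64_t acked_features; } VHostNetState;

#if CONFIG_VHOST_NET
/* Real implementation: report what the backend actually acked. */
uint64_t vhost_net_get_acked_features(VHostNetState *net)
{
    return net->acked_features;
}
#else
/* Dummy implementation: no vhost-net, so no acked features. */
uint64_t vhost_net_get_acked_features(VHostNetState *net)
{
    (void)net;
    return 0;
}
#endif
```

Without the `#else` branch, any caller of the getter fails to link on targets where vhost-net is compiled out, which is the build breakage the review comment refers to.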
Re: [Qemu-devel] [PATCH 00/10] RFCv3: vhost-user: simple reconnection support
On Sun, May 15, 2016 at 12:42:12PM +0300, Michael S. Tsirkin wrote: > > FYI, I have a follow-up series (~20 patches, > > https://github.com/elmarco/qemu/tree/vhost-user-reconnect) doing > > mostly cleanups and extra checks for disconnection at run time. In > > particular, it should avoid some obvious crashers/asserts, and > > prevent qemu from running as long as the initial vhost_user_start() > > didn't succeed (so initial flags are set). I would like to know how to > > proceed with the follow-up: should I resend the whole series, or should > > we review/merge this RFC first (even though it is known to be > > incomplete in many disconnect cases that the follow-up fixes)? > > > > thanks > > I think a gradual merge is better. +1 May I ask what the plan is to merge this patch set? DPDK has some dependency on it. --yliu
Re: [Qemu-devel] [PATCH 02/10] vubr: add client mode
Hi, Just to let you know that as a client, it need to add the reconnect ability, as the QEMU (as the server) may restart as well. Well, I'm thinking that you may think it's an example bridge only, so, let it be simple. Then I'm sorry for the noise. --yliu On Tue, May 10, 2016 at 06:03:52PM +0200, marcandre.lur...@redhat.com wrote: > From: Marc-André Lureau > > If -c is specified, vubr will try to connect to the socket instead of > listening for connections. > > Signed-off-by: Marc-André Lureau > --- > tests/vhost-user-bridge.c | 38 ++ > 1 file changed, 26 insertions(+), 12 deletions(-) > > diff --git a/tests/vhost-user-bridge.c b/tests/vhost-user-bridge.c > index 0779ba2..f907ce7 100644 > --- a/tests/vhost-user-bridge.c > +++ b/tests/vhost-user-bridge.c > @@ -1204,12 +1204,13 @@ vubr_accept_cb(int sock, void *ctx) > } > > static VubrDev * > -vubr_new(const char *path) > +vubr_new(const char *path, bool client) > { > VubrDev *dev = (VubrDev *) calloc(1, sizeof(VubrDev)); > dev->nregions = 0; > int i; > struct sockaddr_un un; > +CallbackFunc cb; > size_t len; > > for (i = 0; i < MAX_NR_VIRTQUEUE; i++) { > @@ -1238,19 +1239,27 @@ vubr_new(const char *path) > un.sun_family = AF_UNIX; > strcpy(un.sun_path, path); > len = sizeof(un.sun_family) + strlen(path); > -unlink(path); > > -if (bind(dev->sock, (struct sockaddr *) &un, len) == -1) { > -vubr_die("bind"); > -} > +if (!client) { > +unlink(path); > + > +if (bind(dev->sock, (struct sockaddr *) &un, len) == -1) { > +vubr_die("bind"); > +} > > -if (listen(dev->sock, 1) == -1) { > -vubr_die("listen"); > +if (listen(dev->sock, 1) == -1) { > +vubr_die("listen"); > +} > +cb = vubr_accept_cb; > +} else { > +if (connect(dev->sock, (struct sockaddr *)&un, len) == -1) { > +vubr_die("connect"); > +} > +cb = vubr_receive_cb; > } > > dispatcher_init(&dev->dispatcher); > -dispatcher_add(&dev->dispatcher, dev->sock, (void *)dev, > - vubr_accept_cb); > +dispatcher_add(&dev->dispatcher, dev->sock, (void *)dev, cb); > > DPRINT("Waiting 
for connections on UNIX socket %s ...\n", path); > return dev; > @@ -1369,8 +1378,9 @@ main(int argc, char *argv[]) > { > VubrDev *dev; > int opt; > +bool client = false; > > -while ((opt = getopt(argc, argv, "l:r:u:")) != -1) { > +while ((opt = getopt(argc, argv, "l:r:u:c")) != -1) { > > switch (opt) { > case 'l': > @@ -1386,16 +1396,20 @@ main(int argc, char *argv[]) > case 'u': > ud_socket_path = strdup(optarg); > break; > +case 'c': > +client = true; > +break; > default: > goto out; > } > } > > -DPRINT("ud socket: %s\n", ud_socket_path); > +DPRINT("ud socket: %s (%s)\n", ud_socket_path, > + client ? "client" : "server"); > DPRINT("local: %s:%s\n", lhost, lport); > DPRINT("remote:%s:%s\n", rhost, rport); > > -dev = vubr_new(ud_socket_path); > +dev = vubr_new(ud_socket_path, client); > if (!dev) { > return 1; > } > -- > 2.7.4
Re: [Qemu-devel] [PATCH 00/10] RFCv3: vhost-user: simple reconnection support
Hi Marc, I was also thinking we should do it in this way first. That simplifies things. So, feel free to add: Tested-by: Yuanhan Liu Reviewed-by: Yuanhan Liu (well, I didn't review the test case carefully). Thanks. --yliu On Tue, May 10, 2016 at 06:03:50PM +0200, marcandre.lur...@redhat.com wrote: > From: Marc-André Lureau > > Hi, > > In a previous series "RFCv2: vhost-user: shutdown and reconnection", I > proposed to add a new slave request to handle graceful shutdown, for > both qemu configurations, server or client, while keeping the guest > running with link down status. > > However, for the simple case where qemu is configured as server, and > the backend processes packets in order and disconnects, it is possible > for the backend to recover after reconnection by discarding the qemu > SET_VRING_BASE value and resuming from the used->index instead. This > simplifies the reconnection support in this particular situation (it > would make sense to have the backend declare this behaviour with an > extra flag). > > The guest won't be notified of link status changes and queues may not > be processed in a timely manner; also, qemu may assert if some > vhost-user commands happen while the backend is disconnected. > So the reconnection must happen "quickly enough" here. In order to > keep the series relatively small, these further problems will be > addressed later. > > This series demonstrates simple reconnect support for vubr > (configured as client and qemu as server), and includes some nice-to-have > fixes and a simple test.
> > Marc-André Lureau (8): > vubr: add client mode > vubr: workaround stale vring base > vhost-user: disconnect on start failure > vhost-net: do not crash if backend is not present > vhost-net: save & restore vhost-user acked features > vhost-net: save & restore vring enable state > tests: append i386 tests > test: start vhost-user reconnect test > > Tetsuya Mukawa (2): > vhost-user: add ability to know vhost-user backend disconnection > qemu-char: add qemu_chr_disconnect to close a fd accepted by listen fd > > hw/net/vhost_net.c| 39 ++- > include/net/net.h | 1 + > include/net/vhost-user.h | 1 + > include/net/vhost_net.h | 3 ++ > include/sysemu/char.h | 7 +++ > net/vhost-user.c | 32 +++- > qemu-char.c | 8 +++ > tests/Makefile| 2 +- > tests/vhost-user-bridge.c | 45 - > tests/vhost-user-test.c | 123 > +++--- > 10 files changed, 228 insertions(+), 33 deletions(-) > > -- > 2.7.4
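The recovery trick in the cover letter above — discarding the replayed `SET_VRING_BASE` value and resuming from `used->index` — works because a backend that processes descriptors strictly in order has, by definition, consumed exactly `used->idx` entries at the moment it dies. A minimal model (the function name is illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Just the piece of the used ring that matters for recovery. */
struct vring_used { uint16_t idx; /* entries the backend has consumed */ };

/*
 * On reconnect, QEMU replays SET_VRING_BASE with a possibly stale value.
 * An in-order backend can ignore it and trust the used ring instead,
 * since used->idx survives in guest memory across the backend restart.
 */
static uint16_t resume_point(uint16_t stale_vring_base,
                             const struct vring_used *used)
{
    (void)stale_vring_base;      /* deliberately discarded */
    return used->idx;            /* next avail entry to process */
}
```

This is also why the cover letter suggests a flag for the backend to declare the behaviour: an out-of-order backend cannot make this inference, and would need the full graceful-shutdown protocol from the RFCv2 series instead.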
Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support
Hello, On Wed, May 04, 2016 at 10:13:49PM +0300, Michael S. Tsirkin wrote: > How do you avoid it? > > > > Management is required to make this robust, auto-reconnect > > > is handy for people bypassing management. > > > > tbh, I don't like autoreconnect. My previous series didn't include > > this and assumed the feature would be supported only when qemu is > > configured to be the server. I added reconnect upon request by users. > > I don't have better solutions so OK I guess. Yes, it's a request from me :) Well, there may be a few others who also requested that. The reason I asked for this is that, so far, we have only one vhost-user frontend, and that is QEMU. But we may have many backends; I'm aware of 4 at the time of writing, including the vubr from QEMU. While we could do the vhost-user client and reconnect implementation on all backends, it's clear that implementing it in the only frontend (QEMU) brings more benefits. OTOH, I could implement DPDK vhost-user as a client and try reconnect there, if that could make things a bit easier. --yliu
Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support
On Mon, May 02, 2016 at 03:04:30PM +0300, Michael S. Tsirkin wrote: > On Mon, May 02, 2016 at 01:29:18PM +0200, Marc-André Lureau wrote: > > Hi > > > > On Mon, May 2, 2016 at 12:45 PM, Michael S. Tsirkin wrote: > > > 1. Graceful disconnect > > > - One should be able to do vmstop, disconnect, connect then vm start > > > This looks like a nice intermediate step. > > > - Why would we always assume it's always remote initiating the disconnect? > > > Monitor commands for disconnecting seem called for. > > > > Those two solutions involve VM management. This looks more complex to > > communicate/synchronize vhost-user backend & vm management & qemu. The > > use case I cover is request from the backend to shutdown, > > Right, but flushing buffers + closing the socket looks like > a cleaner interface than a ton of messages going back and forth. I'd agree with Michael on that. It needs more care when dealing with two stream buffers: you can't quit the backend right after the shutdown request; instead, you have to wait for the ACK from QEMU. Making the request from QEMU avoids that. --yliu
Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support
On Mon, May 02, 2016 at 01:45:31PM +0300, Michael S. Tsirkin wrote: > On Sun, May 01, 2016 at 02:04:42PM -0700, Yuanhan Liu wrote: > > On Sun, May 01, 2016 at 02:37:19PM +0300, Michael S. Tsirkin wrote: > > > On Fri, Apr 29, 2016 at 10:48:35AM -0700, Yuanhan Liu wrote: > > > > > But, I > > > > > would worry first about a backend that crashes that it may corrupt the > > > > > VM memory too... > > > > > > > > Not quite sure I follow this. But, backend just touches the virtio > > > > related memory, so it will do no harm to the VM? > > > > > > It crashed so apparently it's not behaving as designed - > > > how can we be sure it does not harm the VM? > > > > Hi Michael, > > > > What I know so far, a crash might could corrupt the virtio memory in two > > ways: > > > > - vring_used_elem might be in stale state when we are in the middle of > > updating used vring. Say, we might have updated the "id" field, but > > leaving "len" untouched. > > > > - vring desc buff might also be in stale state, when we are copying > > data into it. > > > - a random write into VM memory due to backend bug corrupts it. > > > However, the two issues will not be real issue, as used->idx is not > > updated in both case. Thefore, those corrupted memory will not be > > processed by guest. So, no harm I'd say. Or, am I missing anything? > > Why is backend crashing? It shouldn't so maybe this means it's > memory is corrupt. That is the claim. As far as virtio memory is not corrupted (or even corrupt in above ways), there would be no issue. But, yes, you made a good point: we make no guarantees that it's not the virtio memory corruption caused the crash. > > BTW, Michael, what's your thoughts on the whole reconnect stuff? > > > > --yliu > > My thoughts are that we need to split these patchsets up. > > There are several issues here: > > > 1. Graceful disconnect > - One should be able to do vmstop, disconnect, connect then vm start > This looks like a nice intermediate step. 
> - Why would we always assume it's always remote initiating the disconnect? > Monitor commands for disconnecting seem called for. A monitor command is a good suggestion. But I'm wondering why vmstop is necessary. Basically, a disconnect is akin to a cable unplug on a physical NIC; we don't need to pause it before unplugging. > > 2. Unexpected disconnect > - If buffers are processed in order or flushed before socket close, > then on disconnect status can be recovered > from ring alone. If that is of interest, we might want to add > a protocol flag for that. F_DISCONNECT_FLUSH ? Sorry, how is this flag supposed to work here? > Without this, > only graceful disconnect or reset with guest's help can be supported. > - As Marc-André said, without graceful shutdown it is not enough to > handle ring state though. We must handle socket getting disconnected > in the middle of send/receive. While it's more work, > it does seem like a nice interface to support. Same as above: what can the backend (or QEMU) do for this case without the guest's (reset) help? > - I understand the concern that existing guests do not handle NEED_RESET > status. One way to fix this would be a new feature bit. If NEED_RESET not > handled, I could check the code by myself, but I'm thinking it might be trivial for you to answer my question: how do we know that NEED_RESET is not handled? > request hot-unplug instead. Same as above, may I know how to request a hot-unplug? > > 3. Running while disconnected > - At the moment, we wait on vm start for remote to connect, > if we support disconnecting backend without stopping > we probably should also support starting it without waiting > for connection Agreed. I guess that would answer my first question: we don't actually need to do vmstop before disconnect. --yliu > - Guests expect tx buffers to be used in a timely manner, thus: > - If vm is running we must use them in qemu, maybe discarding packets > in the process.
> There already are commands for link down, I'm not sure there's value > in doing this automatically in qemu. > - Alternatively, we should also have an option to stop vm automatically (like > we do on > disk errors) to reduce number of dropped packets. > > 4. Reconnecting > - When acting as a server, we might want an option to go back to > listening state, but it should not be the only option, > there should be a monitor command for moving it back to > listening state. > - When acting as a client, auto-reconnect might be handy sometimes, but > should not be the only > option, there should be a way to manually request connect, possibly to > a different target, so a monitor command for re-connecting seems called for. > - We'll also need monitor events for disconnects so management knows it > must re-connect/restart listening. > - If we stopped VM, there could be an option to restart on reconnect. > > > -- > MST
Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support
On Sun, May 01, 2016 at 02:37:19PM +0300, Michael S. Tsirkin wrote: > On Fri, Apr 29, 2016 at 10:48:35AM -0700, Yuanhan Liu wrote: > > > But, I > > > would worry first about a backend that crashes that it may corrupt the > > > VM memory too... > > > > Not quite sure I follow this. But, backend just touches the virtio > > related memory, so it will do no harm to the VM? > > It crashed so apparently it's not behaving as designed - > how can we be sure it does not harm the VM? Hi Michael, From what I know so far, a crash could corrupt the virtio memory in two ways: - vring_used_elem might be in a stale state when we are in the middle of updating the used vring. Say, we might have updated the "id" field, but left "len" untouched. - The vring desc buffer might also be in a stale state, when we are copying data into it. However, neither issue will be a real problem, as used->idx is not updated in either case. Therefore, that corrupted memory will not be processed by the guest. So, no harm I'd say. Or, am I missing anything? BTW, Michael, what are your thoughts on the whole reconnect stuff? --yliu
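The ordering argument above, that a half-written used element is harmless because used->idx is published last, can be sketched in a few lines of C. This is an illustrative model only: the field names mimic virtio's split used ring, but the layout and the C11 atomics are simplifying assumptions, not QEMU or DPDK code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Minimal model of a virtio used ring (illustrative, not the real layout).
 * The backend publishes an entry in two steps: fill the element, then bump
 * idx.  A crash between the two steps leaves a stale element that the
 * guest never consumes, because idx still holds the old count. */
#define RING_SIZE 256

struct vring_used_elem { uint32_t id; uint32_t len; };

struct vring_used {
    uint16_t flags;
    _Atomic uint16_t idx;                 /* count of published entries */
    struct vring_used_elem ring[RING_SIZE];
};

static void backend_push_used(struct vring_used *u, uint32_t id, uint32_t len)
{
    uint16_t idx = atomic_load_explicit(&u->idx, memory_order_relaxed);
    u->ring[idx % RING_SIZE].id  = id;
    u->ring[idx % RING_SIZE].len = len;
    /* release: the element is fully written before idx moves */
    atomic_store_explicit(&u->idx, (uint16_t)(idx + 1), memory_order_release);
}

/* Guest side: consumes an entry only if idx says it was published. */
static int guest_pop_used(struct vring_used *u, uint16_t last_seen,
                          struct vring_used_elem *out)
{
    if (atomic_load_explicit(&u->idx, memory_order_acquire) == last_seen)
        return 0;                         /* nothing new published */
    *out = u->ring[last_seen % RING_SIZE];
    return 1;
}
```

If the backend crashes after writing the element but before the release store to idx, the guest's acquire load still returns the old count, so the stale element is invisible, which is exactly the "no harm" claim made here.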
Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support
On Fri, Apr 29, 2016 at 12:40:09PM +0200, Marc-André Lureau wrote: > Hi > > On Thu, Apr 28, 2016 at 7:23 AM, Yuanhan Liu > wrote: > > On Fri, Apr 01, 2016 at 01:16:21PM +0200, marcandre.lur...@redhat.com wrote: > >> From: Marc-André Lureau > >> +Slave message types > >> +--- > >> + > >> + * VHOST_USER_SLAVE_SHUTDOWN: > >> + Id: 1 > >> + Master payload: N/A > >> + Slave payload: u64 > >> + > >> + Request the master to shutdown the slave. A 0 reply is for > >> + success, in which case the slave may close all connections > >> + immediately and quit. A non-zero reply cancels the request. > >> + > >> + Before a reply comes, the master may make other requests in > >> + order to flush or sync state. > > > > Hi all, > > > > I threw this proposal as well as DPDK's implementation to our customer > > (OVS, Openstack and some other teams) who made such request before. I'm > > sorry to say that none of them really liked that we can't handle crash. > > Making reconnect work from a vhost-user backend crash is exactly something > > they are after. > > Handling crashes is not quite the same as what I propose here. Yes, I know. However, handling crashes is exactly what our customers want. And I just want to let you know that, say, I don't ask you to do that :) > I see > it as a different goal. But I doubt about usefulness and reliability > of a backend that crashes. Agreed. However, I guess you have to admit that crashes just happen. The kernel sometimes crashes, too. So, it would be great if we could make the whole stack work again after an unexpected crash (say, of OVS), without restarting all the guests. > In many case, vhost-user was designed after > kernel vhost, and qemu code has the same expectation about the kernel > or the vhost-user backend: many calls are sync and will simply > assert() on unexpected results. I guess we could at least try to diminish that, if we can't avoid it completely. > > And to handle the crash, I was thinking of the proposal from Michael.
> > That is to do reset from the guest OS. This would fix this issue > > ultimately. However, old kernel will not benefit from this, as well > > as other guest other than Linux, making it not that useful for current > > usage. > > Yes, I hope Michael can help with that, I am not very familiar with > the kernel code. > > > Thinking of that the VHOST_USER_SLAVE_SHUTDOWN just gives QEMU a chance > > to get the vring base (last used idx) from the backend, Huawei suggests > > Right, but after this message, the backend should have flushed all > pending ring packets and stop processing them. So it's also a clean > sync point. > > > that we could still make it in a consistent state after the crash, if > > we get the vring base from vring->used->idx. That worked as expected > > You can have a backend that would have already processed packets and > not updated used idx. You could also have out-of-order packets in > flights (ex: desc 1-2-3 avail, 1-3 used, 2 pending..). I can't see a > clean way to restore this, but to reset the queues and start over, > with either packet loss or packet duplication. Judging that it (a crash or restart) happens so rarely, I don't think it matters. Moreover, doesn't that happen in the real world anyway :) > If the backend > guarantees to process packets in order, it may simplify things, but it > would be a special case. Well, it's more of a backend thing: it's up to the backend to set a saner vring base, as stated in the above proposal. Therefore, I would not say it's a special case. > > > from my test. The only tricky thing might be how to detect a crash, > > and we could do a simple compare of the vring base from QEMU with > > the vring->used->idx at the initiation stage. If mismatch found, get > > it from vring->used->idx instead. > > I don't follow, would the backend restore its last vring->used->idx > after a crash? Yes, restored from the SET_VRING_BASE from QEMU. But it's a stale value, normally 0 if no start/stop happened before.
Therefore, we can't use it after the crash, instead, we could try to detect the mismatch and try to fix it at SET_VRING_ADDR request. > > > Comments/thoughts? Is it a solid enough solution to you? If so, we > > could make things much simpler, and what's most important, we could > > be able to handle crash. > > That's not so easy, many of the vhost_ops->vhost*() are followed by > assert(r>0), which will be hard to change to handle failure. But, I > would worry first about a backend that crashes that it may corrupt the > VM memory too... Not quite sure I follow this. But, backend just touches the virtio related memory, so it will do no harm to the VM? --yliu
Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support
On Fri, Apr 01, 2016 at 01:16:21PM +0200, marcandre.lur...@redhat.com wrote: > From: Marc-André Lureau > > Signed-off-by: Marc-André Lureau > --- > docs/specs/vhost-user.txt | 15 +++ > hw/virtio/vhost-user.c| 16 > 2 files changed, 31 insertions(+) > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt > index 8a635fa..60d6d13 100644 > --- a/docs/specs/vhost-user.txt > +++ b/docs/specs/vhost-user.txt > @@ -487,3 +487,18 @@ Message types >request to the master. It is passed in the ancillary data. >This message is only sent if VHOST_USER_PROTOCOL_F_SLAVE_CHANNEL >feature is available. > + > +Slave message types > +--- > + > + * VHOST_USER_SLAVE_SHUTDOWN: > + Id: 1 > + Master payload: N/A > + Slave payload: u64 > + > + Request the master to shutdown the slave. A 0 reply is for > + success, in which case the slave may close all connections > + immediately and quit. A non-zero reply cancels the request. > + > + Before a reply comes, the master may make other requests in > + order to flush or sync state. Hi all, I threw this proposal as well as DPDK's implementation to our customer (OVS, Openstack and some other teams) who made such request before. I'm sorry to say that none of them really liked that we can't handle crash. Making reconnect work from a vhost-user backend crash is exactly something they are after. And to handle the crash, I was thinking of the proposal from Michael. That is to do reset from the guest OS. This would fix this issue ultimately. However, old kernel will not benefit from this, as well as other guest other than Linux, making it not that useful for current usage. Thinking of that the VHOST_USER_SLAVE_SHUTDOWN just gives QEMU a chance to get the vring base (last used idx) from the backend, Huawei suggests that we could still make it in a consistent state after the crash, if we get the vring base from vring->used->idx. That worked as expected from my test. 
The only tricky thing might be how to detect a crash, and we could do a simple comparison of the vring base from QEMU with the vring->used->idx at the initiation stage. If a mismatch is found, get it from vring->used->idx instead. Comments/thoughts? Does it sound like a solid enough solution to you? If so, we could make things much simpler, and, what's most important, we would be able to handle crashes. --yliu
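The recovery rule proposed above is small enough to state directly as code. A minimal sketch, with a made-up helper name; the real check would live wherever the backend handles SET_VRING_BASE:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the proposed crash-detection rule: after reconnect, QEMU resends
 * the vring base it last saved via SET_VRING_BASE.  Across a backend crash
 * this value is stale (typically 0), while used->idx reflects what was
 * actually processed, since an in-order backend bumps used->idx only after a
 * buffer is fully handled.  So on mismatch, trust the ring. */
static uint16_t resume_vring_base(uint16_t base_from_qemu, uint16_t used_idx)
{
    return base_from_qemu == used_idx ? base_from_qemu : used_idx;
}
```

This only holds under the in-order-processing assumption discussed in the thread; with out-of-order completion there is no single index to resume from.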
Re: [Qemu-devel] [PATCH 04/18] char: add wait support for reconnect
On Fri, Apr 01, 2016 at 01:16:14PM +0200, marcandre.lur...@redhat.com wrote: > From: Marc-André Lureau > > Until now, 'wait' was solely used for listening sockets. However, it can > also be useful for 'reconnect' socket kind, where the first open must > succeed before continuing. > > This allows for instance (with other support patches) having vhost-user > wait for an initial connection to setup the vhost-net, before eventually > accepting new connections. I have met a tricky issue about this patch. Assume the socket exist in before we start QEMU, but somehow we can not connect to it, say due to permission error. In such case, QEMU would wait there forever, without any error messages. I would assume it's a bug, right? --yliu > > Signed-off-by: Marc-André Lureau > --- > qemu-char.c | 31 +++ > 1 file changed, 23 insertions(+), 8 deletions(-) > > diff --git a/qemu-char.c b/qemu-char.c > index 8702931..3e25c08 100644 > --- a/qemu-char.c > +++ b/qemu-char.c > @@ -3659,7 +3659,7 @@ static void qemu_chr_parse_socket(QemuOpts *opts, > ChardevBackend *backend, >Error **errp) > { > bool is_listen = qemu_opt_get_bool(opts, "server", false); > -bool is_waitconnect = is_listen && qemu_opt_get_bool(opts, "wait", true); > +bool is_wait= qemu_opt_get_bool(opts, "wait", is_listen); > bool is_telnet = qemu_opt_get_bool(opts, "telnet", false); > bool do_nodelay = !qemu_opt_get_bool(opts, "delay", true); > int64_t reconnect = qemu_opt_get_number(opts, "reconnect", 0); > @@ -3696,7 +3696,7 @@ static void qemu_chr_parse_socket(QemuOpts *opts, > ChardevBackend *backend, > sock->has_telnet = true; > sock->telnet = is_telnet; > sock->has_wait = true; > -sock->wait = is_waitconnect; > +sock->wait = is_wait; > sock->has_reconnect = true; > sock->reconnect = reconnect; > sock->tls_creds = g_strdup(tls_creds); > @@ -4350,7 +4350,7 @@ static CharDriverState *qmp_chardev_open_socket(const > char *id, > bool do_nodelay = sock->has_nodelay ? 
sock->nodelay : false; > bool is_listen = sock->has_server ? sock->server : true; > bool is_telnet = sock->has_telnet ? sock->telnet : false; > -bool is_waitconnect = sock->has_wait? sock->wait: false; > +bool is_wait= sock->has_wait? sock->wait: false; > int64_t reconnect = sock->has_reconnect ? sock->reconnect : 0; > ChardevCommon *common = qapi_ChardevSocket_base(sock); > QIOChannelSocket *sioc = NULL; > @@ -4424,7 +4424,7 @@ static CharDriverState *qmp_chardev_open_socket(const > char *id, > } > > sioc = qio_channel_socket_new(); > -if (s->reconnect_time) { > +if (s->reconnect_time && !is_wait) { > qio_channel_socket_connect_async(sioc, s->addr, > qemu_chr_socket_connected, > chr, NULL); > @@ -4433,7 +4433,7 @@ static CharDriverState *qmp_chardev_open_socket(const > char *id, > goto error; > } > s->listen_ioc = sioc; > -if (is_waitconnect) { > +if (is_wait) { > trace_char_socket_waiting(chr->filename); > tcp_chr_accept(QIO_CHANNEL(s->listen_ioc), G_IO_IN, chr); > } > @@ -4443,9 +4443,24 @@ static CharDriverState *qmp_chardev_open_socket(const > char *id, > QIO_CHANNEL(s->listen_ioc), G_IO_IN, tcp_chr_accept, chr, > NULL); > } > } else { > -if (qio_channel_socket_connect_sync(sioc, s->addr, errp) < 0) { > -goto error; > -} > +do { > +Error *err = NULL; > + > +if (qio_channel_socket_connect_sync(sioc, s->addr, &err) < 0) { > +if (reconnect) { > +trace_char_socket_reconnect_error(chr->label, > + error_get_pretty(err)); > +error_free(err); > +continue; > +} else { > +error_propagate(errp, err); > +goto error; > +} > +} else { > +break; > +} > +} while (true); > + > tcp_chr_new_client(chr, sioc); > object_unref(OBJECT(sioc)); > } > -- > 2.5.5
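The do/while added by the quoted hunk can be sketched as a standalone connect-with-retry loop. The version below is an illustrative sketch, not QEMU's actual code; unlike the patch, it logs every failure and bounds the number of attempts, which would also avoid the silent forever-wait on a permanent error (such as the permission problem raised above):

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Connect to a Unix socket, retrying on failure.  Every failed attempt is
 * reported, and a permanent error (EACCES, ENOENT, ...) can only stall the
 * caller for max_tries * retry_delay_sec seconds, never forever. */
static int connect_with_retry(const char *path, int max_tries,
                              unsigned retry_delay_sec)
{
    for (int i = 0; i < max_tries; i++) {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;

        struct sockaddr_un addr;
        memset(&addr, 0, sizeof(addr));
        addr.sun_family = AF_UNIX;
        strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
            return fd;                    /* connected */

        fprintf(stderr, "connect %s failed: %s (try %d/%d)\n",
                path, strerror(errno), i + 1, max_tries);
        close(fd);
        sleep(retry_delay_sec);
    }
    return -1;                            /* permanent failure */
}
```

Whether QEMU should give up or keep retrying is a policy question, but at minimum reporting the error each round keeps the hang diagnosable.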
Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support
On Wed, Apr 13, 2016 at 11:43:56PM +0200, Marc-André Lureau wrote: > On Wed, Apr 13, 2016 at 7:32 PM, Yuanhan Liu > wrote: > >> > >> > I'm asking because I found a seg fault issue sometimes, > >> > due to opaque is NULL. > > > > Oh, I was wrong, it's u being NULL, but not opaque. > >> > > >> > >> I would be interested to see the backtrace or have a reproducer. > > > > It's a normal test steps: start a vhost-user switch (I'm using DPDK > > vhost-switch example), kill it, and wait for a while (something like > > more than 10s or even longer), then I saw a seg fault: > > > > (gdb) p dev > > $4 = (struct vhost_dev *) 0x56571bf0 > > (gdb) p u > > $5 = (struct vhost_user *) 0x0 > > (gdb) where > > #0 0x55798612 in slave_read (opaque=0x56571bf0) > > at /home/yliu/qemu/hw/virtio/vhost-user.c:539 > > #1 0x55a343a4 in aio_dispatch (ctx=0x5655f560) at > > /home/yliu/qemu/aio-posix.c:327 > > #2 0x55a2738b in aio_ctx_dispatch (source=0x5655f560, > > callback=0x0, user_data=0x0) > > at /home/yliu/qemu/async.c:233 > > #3 0x751032a6 in g_main_context_dispatch () from > > /lib64/libglib-2.0.so.0 > > #4 0x55a3239e in glib_pollfds_poll () at > > /home/yliu/qemu/main-loop.c:213 > > #5 0x55a3247b in os_host_main_loop_wait (timeout=29875848) at > > /home/yliu/qemu/main-loop.c:258 > > #6 0x55a3252b in main_loop_wait (nonblocking=0) at > > /home/yliu/qemu/main-loop.c:506 > > #7 0x55846e35 in main_loop () at /home/yliu/qemu/vl.c:1934 > > #8 0x5584e6bf in main (argc=31, argv=0x7fffe078, > > envp=0x7fffe178) > > at /home/yliu/qemu/vl.c:4658 > > > > This patch set doesn't try to handle crashes from backend. This would > require a much more detailed study of the existing code path. A lot of > places assume the backend is fully working as expected. I think > handling backend crashes should be a different, later, patch set. Oh, sorry for not making it clear. I actually did the kill by "ctrl-c". It then is captured to send a SLAVE_SHUTDOWN request. So, I would say it's a normal quit. --yliu
Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support
On Wed, Apr 13, 2016 at 05:51:15AM -0400, Marc-André Lureau wrote: > Hi > > - Original Message - > > Hi Marc, > > > > First of all, sorry again for late response! > > > > Last time I tried with your first version, I found few issues related > > with reconnect, mainly on the acked_feautres lost. While checking your > > new code, I found that you've already solved that, which is great. > > > > So, I tried harder this time, your patches work great, except that I > > found few nits. > > > > On Fri, Apr 01, 2016 at 01:16:21PM +0200, marcandre.lur...@redhat.com wrote: > > > From: Marc-André Lureau > > ... > > > +Slave message types > > > +--- > > > + > > > + * VHOST_USER_SLAVE_SHUTDOWN: > > > + Id: 1 > > > + Master payload: N/A > > > + Slave payload: u64 > > > + > > > + Request the master to shutdown the slave. A 0 reply is for > > > + success, in which case the slave may close all connections > > > + immediately and quit. > > > > Assume we are using ovs + dpdk here, that we could have two > > vhost-user connections. While ovs tries to initiate a restart, > > it might unregister the two connections one by one. In such > > case, two VHOST_USER_SLAVE_SHUTDOWN request will be sent, > > and two replies will get. Therefore, I don't think it's a > > proper ask here to let the backend implementation to do quit > > here. > > > > On success reply, the master sent all the commands to finish the connection. > So the slave must flush/finish all pending requests first. Yes, that's okay. Here I just mean the "close __all__ connections" and "quit" part. First, the slave should do the cleanup/flush/finish for its own connection, but not for all of them, right? Second, as stated, quitting might not make sense, as we may have more connections. > I think this should be enough, otherwise we may need a new explicit message?
> > > > > > > > > switch (msg.request) { > > > +case VHOST_USER_SLAVE_SHUTDOWN: { > > > +uint64_t success = 1; /* 0 is for success */ > > > +if (dev->stop) { > > > +dev->stop(dev); > > > +success = 0; > > > +} > > > +msg.payload.u64 = success; > > > +msg.size = sizeof(msg.payload.u64); > > > +size = send(u->slave_fd, &msg, VHOST_USER_HDR_SIZE + msg.size, > > > 0); > > > +if (size != VHOST_USER_HDR_SIZE + msg.size) { > > > +error_report("Failed to write reply."); > > > +} > > > +break; > > > > You might want to remove the slave_fd from watch list? We > > might also need to close slave_fd here, assuming that we > > will no longer use it when VHOST_USER_SLAVE_SHUTDOWN is > > received? > > Makes sense, I will change that in next update. > > > I'm asking because I found a seg fault issue sometimes, > > due to opaque is NULL. Oh, I was wrong, it's u being NULL, but not opaque. > > > > I would be interested to see the backtrace or have a reproducer. It's a normal test steps: start a vhost-user switch (I'm using DPDK vhost-switch example), kill it, and wait for a while (something like more than 10s or even longer), then I saw a seg fault: (gdb) p dev $4 = (struct vhost_dev *) 0x56571bf0 (gdb) p u $5 = (struct vhost_user *) 0x0 (gdb) where #0 0x55798612 in slave_read (opaque=0x56571bf0) at /home/yliu/qemu/hw/virtio/vhost-user.c:539 #1 0x55a343a4 in aio_dispatch (ctx=0x5655f560) at /home/yliu/qemu/aio-posix.c:327 #2 0x55a2738b in aio_ctx_dispatch (source=0x5655f560, callback=0x0, user_data=0x0) at /home/yliu/qemu/async.c:233 #3 0x751032a6 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0 #4 0x55a3239e in glib_pollfds_poll () at /home/yliu/qemu/main-loop.c:213 #5 0x55a3247b in os_host_main_loop_wait (timeout=29875848) at /home/yliu/qemu/main-loop.c:258 #6 0x55a3252b in main_loop_wait (nonblocking=0) at /home/yliu/qemu/main-loop.c:506 #7 0x55846e35 in main_loop () at /home/yliu/qemu/vl.c:1934 #8 0x5584e6bf in main (argc=31, argv=0x7fffe078, envp=0x7fffe178) at 
/home/yliu/qemu/vl.c:4658 --yliu
Re: [Qemu-devel] [PATCH 18/18] test: add shutdown support vubr test
On Fri, Apr 01, 2016 at 01:16:28PM +0200, marcandre.lur...@redhat.com wrote: > +static void > +vubr_handle_slave_reply(VhostUserMsg *vmsg) > +{ > +DPRINT( > +"== Vhost slave reply from QEMU > ==\n"); > +DPRINT("Request: %s (%d)\n", vubr_slave_request_str[vmsg->request], > + vmsg->request); > +DPRINT("Flags: 0x%x\n", vmsg->flags); > +DPRINT("Size:%d\n", vmsg->size); > + > +switch (vmsg->request) { > +case VHOST_USER_SLAVE_SHUTDOWN: > +DPRINT("Shutdown success: 0x%016"PRIx64"\n", vmsg->payload.u64); > +if (vmsg->payload.u64 == 0) { > +exit(0); > +} > +default: > +DPRINT("Invalid slave reply"); > +}; ^^ Minor nit: redundant ';'. --yliu
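Beyond the stray ';', note that the quoted case also lacks a break, so a shutdown reply falls through into the "Invalid slave reply" branch. For illustration, here is a standalone version of the handler with both issues addressed; the types and the return-value convention are stand-ins, not vubr's actual API:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Minimal stand-ins for the vubr types (illustrative only). */
typedef struct VhostUserMsg {
    int request;
    uint64_t u64;            /* stands in for payload.u64 */
} VhostUserMsg;

enum { VHOST_USER_SLAVE_SHUTDOWN = 1 };

/* Returns 1 when the reply asks us to exit, 0 otherwise.  exit() itself is
 * left to the caller so the decision logic stays testable.  Returning from
 * each case (instead of falling through) is the fix for the missing break. */
static int vubr_handle_slave_reply(const VhostUserMsg *vmsg)
{
    switch (vmsg->request) {
    case VHOST_USER_SLAVE_SHUTDOWN:
        printf("Shutdown reply: 0x%016" PRIx64 "\n", vmsg->u64);
        return vmsg->u64 == 0;   /* 0 means the master accepted the shutdown */
    default:
        printf("Invalid slave reply\n");
        return 0;
    }
}
```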
Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support
Hi Marc, First of all, sorry again for the late response! Last time I tried with your first version, I found a few issues related to reconnect, mainly the acked_features being lost. While checking your new code, I found that you've already solved that, which is great. So, I tried harder this time; your patches work great, except that I found a few nits. On Fri, Apr 01, 2016 at 01:16:21PM +0200, marcandre.lur...@redhat.com wrote: > From: Marc-André Lureau ... > +Slave message types > +--- > + > + * VHOST_USER_SLAVE_SHUTDOWN: > + Id: 1 > + Master payload: N/A > + Slave payload: u64 > + > + Request the master to shutdown the slave. A 0 reply is for > + success, in which case the slave may close all connections > + immediately and quit. Assume we are using OVS + DPDK here, so that we could have two vhost-user connections. While OVS tries to initiate a restart, it might unregister the two connections one by one. In such a case, two VHOST_USER_SLAVE_SHUTDOWN requests will be sent, and two replies will be received. Therefore, I don't think it's proper to ask the backend implementation to quit here. > > switch (msg.request) { > +case VHOST_USER_SLAVE_SHUTDOWN: { > +uint64_t success = 1; /* 0 is for success */ > +if (dev->stop) { > +dev->stop(dev); > +success = 0; > +} > +msg.payload.u64 = success; > +msg.size = sizeof(msg.payload.u64); > +size = send(u->slave_fd, &msg, VHOST_USER_HDR_SIZE + msg.size, 0); > +if (size != VHOST_USER_HDR_SIZE + msg.size) { > +error_report("Failed to write reply."); > +} > +break; You might want to remove the slave_fd from the watch list? We might also need to close slave_fd here, assuming that we will no longer use it once VHOST_USER_SLAVE_SHUTDOWN is received? I'm asking because I found a seg fault issue sometimes, because opaque is NULL. --yliu
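The cleanup being suggested, unregister slave_fd from the event loop and close it once the shutdown has been replied to, can be sketched as follows. The struct and the `watched` flag are illustrative stand-ins; in QEMU the unregistration would be the corresponding qemu_set_fd_handler() call, so a later event-loop dispatch cannot invoke slave_read() on a dead connection:

```c
#include <assert.h>
#include <unistd.h>

/* Illustrative stand-in for the slave-channel state; not QEMU's struct. */
struct vhost_user_state {
    int slave_fd;
    int watched;   /* models the fd handler registration */
};

/* After VHOST_USER_SLAVE_SHUTDOWN is handled: drop the watch first, then
 * close, then poison the fd so nothing reuses it. */
static void slave_shutdown_cleanup(struct vhost_user_state *u)
{
    if (u->slave_fd >= 0) {
        u->watched = 0;   /* in QEMU: qemu_set_fd_handler(slave_fd, NULL, ...) */
        close(u->slave_fd);
        u->slave_fd = -1;
    }
}
```

Ordering matters here: removing the watch before closing avoids the window where the event loop still polls a closed (or recycled) descriptor.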
Re: [Qemu-devel] [PATCH RFC 00/14] vhost-user: shutdown and reconnection
On Tue, Mar 29, 2016 at 12:52:32PM +0200, Marc-André Lureau wrote: > Hi > > On Tue, Mar 29, 2016 at 10:10 AM, Yuanhan Liu > wrote: > > Backend crash may be not that normal in production usage, it is in > > development stage. It would be handy if we can handle that as well, > > as it's a painful experience for me that every time I do a VM2VM test > > and once my changes to backend causes a crash (unfortunately, it > > happens often in early stage), I have to reboot the two VMs. > > > > If we have reconnect, life could be easier. I then don't need worry > > that I will mess the backend and crash it any more. > > > > And again, the reason I mentioned crash here again is not because > > we need handle it, specially. Instead, I was thinking we might be > > able to handle the two reasons both. > > I think crash could be handle with queue reset Michael proposed, but > that would probably require guest side changes. Is the reply from Michael in this thread the whole story? If not, would you please give me a link to that discussion? > > >> Conceptually, I think if we allow the backend to disconnect, it makes > >> sense that qemu is actually the socket server. But it doesn't matter > >> much, it's simple to teach qemu to reconnect a timer... > > > > We already have that, right? I mean, the char-dev "reconnect" option. > > Sure, what I mean is that it sounds cleaner from a design pov for qemu > to be the server (since it is the one actually waiting for backend in > this case), Yeah, I agree with you on that. However, think of it this way: if we let QEMU be the server and the backend be the client, the client still needs to keep trying to connect to the server when the server crashes/quits/restarts. My point is, either way, we need to bear in mind that the server could also go down (due to crashes/restarts), so we have to teach the client to reconnect when disconnected.
Judging that QEMU already has the support, I'd slightly prefer to let QEMU still be the client, and do the reconnect tries, if it works. And to make it clear, both should work, but it's more like a question which one will be a better option (to us, DPDK), QEMU be the server or the client? What's sure here is that, in either way, your patches would work and are required. > beside a timer is often a pretty rough solution to > anything. As stated above, I'm afraid that is somehow needed. It might be in QEMU or backend, though. > >> So we should > >> probably allow both cases anyway. > > > > Yes, I think both should work. I may be wrong (that I may miss > > something), but it seems it's (far) easier to me to keep QEMU > > as the client, and adding the "reconnect" option, judging that > > we have all stuff to make it work ready. In this way, > > > > - if backend crashes/quits, you just need restart the backend, > > and QEMU will retry the connection. > > > > - if QEMU crashes/quits, you just need restart the QEMU, and > > then QEMU will start a fresh connection. > > It may all depend on use cases, it's not more obvious or easy than the > other to me. > > > However, if let QEMU as the server, there are 2 major works need > > to be done in the backend side (take DPDK as example, that just > > acts as vhost-user server only so far): > > > > - Introducing a new API or extending current API to let it to > > connect the server socket from QEMU. > > > > - If QEMU crashes/quits, we need add code to backend to keep > > reconnecting unless connection is established, which is something > > similar to the "reconnect" stuff in QEMU. > > As you can see, it needs more effort (though it's not something > > you care :). And it has duplicate work. > > > Ah, I am looking at this from qemu angle, backend may need to adapt if > it doesn't already handle both "socket role" (client & server). Agreed. 
> > > >> > >> > - start DPDK vhost-switch example > >> > > >> > - start QEMU, which will connect to DPDK vhost-user > >> > > >> > link is good now. > >> > > >> > - kill DPDK vhost-switch > >> > > >> > link is broken at this stage > >> > > >> > - start DPDK vhost-switch again > >> > > >> > you will find that the link is back again. > >> > > >> > > >> > Will that makes sense to you? If so, we may need do nothing (or just > >> > very few) changes at all to DPDK to get the reconnect work. > >> > >>
Re: [Qemu-devel] [PATCH RFC 00/14] vhost-user: shutdown and reconnection
On Fri, Mar 25, 2016 at 07:00:15PM +0100, Marc-André Lureau wrote:
> Hi
>
> On Thu, Mar 24, 2016 at 8:10 AM, Yuanhan Liu wrote:
> >> > The following series starts from the idea that the slave can request a
> >> > "managed" shutdown instead and later recover (I guess the use case for
> >> > this is to allow, for example, updating static dispatching/filter rules
> >> > etc)
> >
> > What if the backend crashes, so that no such request will be sent? And
> > I'm wondering why this request is needed, as we are able to detect
> > the disconnect now (with your patches).

Hi, sorry for the late reply, and thanks for the input.

(CC'ed more people, as they are the eager users who wish this feature to be done ASAP; they may provide more valuable input.)

> I don't think trying to handle backend crashes is really a thing we
> need to take care of.

Yes, me neither. The reason I mentioned backend crashes is that they are one of the reasons we need the vhost-user reconnect stuff; the other reason is backend restart.

A backend crash may not be that common in production usage, but it is in the development stage. It would be handy if we could handle that as well: it's a painful experience for me that every time I do a VM2VM test and my changes to the backend cause a crash (unfortunately, it happens often in the early stage), I have to reboot the two VMs. If we have reconnect, life would be easier: I wouldn't need to worry about messing up the backend and crashing it any more.

And again, the reason I mentioned crashes here is not that we need to handle them specially. Instead, I was thinking we might be able to handle both reasons at once.

> If the backend is bad enough to crash, it may as well corrupt the
> guest memory (mst: my understanding of vhost-user is that the backend
> must be trusted, or it could just throw garbage in the queue
> descriptors, with surprising consequences, or elsewhere in the guest
> memory actually, right?).
> > BTW, you meant to let QEMU be the server and the backend be the client
> > here, right? Honestly, that's what we thought of, too, at first.
> >
> > However, I'm wondering whether we could still go with QEMU as the
> > client and the backend as the server (the default and the only way
> > DPDK supports), and let QEMU try to reconnect when the backend crashes
> > and restarts. In that case, we need to enable the "reconnect" option
> > for vhost-user, and once I have done that, it basically works in my
> > test:
>
> Conceptually, I think if we allow the backend to disconnect, it makes
> sense that qemu is actually the socket server. But it doesn't matter
> much; it's simple to teach qemu to reconnect with a timer...

We already have that, right? I mean, the char-dev "reconnect" option.

> So we should probably allow both cases anyway.

Yes, I think both should work. I may be wrong (I may be missing something), but it seems (far) easier to me to keep QEMU as the client and add the "reconnect" option, judging that we have all the pieces ready to make it work. In this way:

- if the backend crashes/quits, you just restart the backend, and QEMU will retry the connection.

- if QEMU crashes/quits, you just restart QEMU, and QEMU will start a fresh connection.

However, if QEMU is the server, there are two major pieces of work to be done on the backend side (taking DPDK as an example, which acts only as a vhost-user server so far):

- introducing a new API, or extending the current API, to let it connect to the server socket from QEMU.

- if QEMU crashes/quits, we need to add code to the backend to keep reconnecting until the connection is established, which is something similar to the "reconnect" support in QEMU.

As you can see, it needs more effort (though not effort you have to care about :). And it duplicates work.

> > - start DPDK vhost-switch example
> >
> > - start QEMU, which will connect to DPDK vhost-user
> >
> >   link is good now.
> > - kill DPDK vhost-switch
> >
> >   link is broken at this stage
> >
> > - start DPDK vhost-switch again
> >
> >   you will find that the link is back again.
> >
> > Will that make sense to you? If so, we may need to make no (or just
> > very few) changes at all to DPDK to get reconnect working.
>
> The main issue with handling crashes (gone at any time) is that the
> backend may not have time to sync the used idx (at the least). It may
> already have processed incoming packets, so on reconnect, it may
> duplicate the receiving/dispatching work.

That's not the case for the DPDK backend implementation: incoming packets won
Re: [Qemu-devel] [PATCH RFC 00/14] vhost-user: shutdown and reconnection
On Thu, Nov 26, 2015 at 12:33:22PM +0200, Michael S. Tsirkin wrote:
> On Wed, Sep 09, 2015 at 01:09:52AM +0200, marcandre.lur...@redhat.com wrote:
> > From: Marc-André Lureau
> >
> > In a previous series "Add feature to start QEMU without vhost-user
> > backend", Tetsuya Mukawa proposed to allow the vhost-user backend to
> > disconnect and reconnect. However, Michael Tsirkin pointed out that
> > you can't do that without extra care, because the guest and hypervisor
> > don't know the slave's ring manipulation state; there might be pending
> > replies, for example, that could be lost. He suggested resetting the
> > guest queues, but this requires kernel changes, and it may have to
> > clear the ring and lose queued packets.

Hi Michael & Marc,

I said I would like to have a try last week, and I did. However, I was only able to have a closer look at the code recently.

> > The following series starts from the idea that the slave can request a
> > "managed" shutdown instead and later recover (I guess the use case for
> > this is to allow, for example, updating static dispatching/filter rules
> > etc)

What if the backend crashes, so that no such request will be sent? And I'm wondering why this request is needed, as we are able to detect the disconnect now (with your patches).

BTW, you meant to let QEMU be the server and the backend be the client here, right? Honestly, that's what we thought of, too, at first.

However, I'm wondering whether we could still go with QEMU as the client and the backend as the server (the default and the only way DPDK supports), and let QEMU try to reconnect when the backend crashes and restarts. In that case, we need to enable the "reconnect" option for vhost-user, and once I have done that, it basically works in my test:

- start DPDK vhost-switch example

- start QEMU, which will connect to DPDK vhost-user

  link is good now.
- kill DPDK vhost-switch

  link is broken at this stage

- start DPDK vhost-switch again

  you will find that the link is back again.

Will that make sense to you? If so, we may need to make no (or just very few) changes at all to DPDK to get reconnect working.

--yliu

> I'm still not sure users actually need this. I am inclined to think we
> should teach guests to respond to NEED_RESET status. Then to handle
> disconnect, we would
> - deactivate the disconnected backend
> - stop VM, and wait for a reconnect
> - set NEED_RESET status, and re-activate the backend
>   after guest reset
>
> --
> MST
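For reference, the chardev "reconnect" option discussed in this thread is expressed on the QEMU command line roughly as below. The socket path and the id names are illustrative, and the exact option spelling may differ between QEMU versions; treat this as a sketch of the configuration, not an authoritative invocation.

```
qemu-system-x86_64 \
    -chardev socket,id=char0,path=/tmp/vhost-user.sock,reconnect=1 \
    -netdev vhost-user,id=net0,chardev=char0 \
    -device virtio-net-pci,netdev=net0 \
    ...
```

With `reconnect=1`, QEMU (acting as the socket client) retries the connection every second after the backend goes away, which is the behavior exercised by the kill-and-restart test above.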
Re: [Qemu-devel] chardev: vhost-user: reconnect issue when QEMU as server
On Wed, Mar 09, 2016 at 04:04:45PM +0100, Marc-André Lureau wrote:
> Hi
>
> On Wed, Mar 9, 2016 at 2:29 PM, Yuanhan Liu wrote:
> > I could spend more time digging into it. But before that, I want to
> > know what you guys think about it. Does that proposal make sense to
> > you? Or any better ideas?
>
> See also this thread, which I have recently rebased:
> https://lists.nongnu.org/archive/html/qemu-devel/2015-09/msg02172.html

Brilliant! Hopefully I can get some time to have a detailed look at the code and give it some tries next week.

BTW, one question: what's the current status of it? Why is the last version from half a year ago?

--yliu

> I am trying to get the following series reviewed/merged before
> discussing the rest:
> https://lists.nongnu.org/archive/html/qemu-devel/2016-02/msg05419.html
>
> Help welcome
>
> --
> Marc-André Lureau
[Qemu-devel] chardev: vhost-user: reconnect issue when QEMU as server
Hi,

Currently, the typical usage of vhost-user is QEMU as the client, while the backend (say, DPDK vhost-user) is the server. There is a major drawback: a restart of the backend (say, for an upgrade) requires restarting QEMU.

The obvious solution would be to let QEMU be the server and the backend be the client. I did a quick try before, and it works as expected the first time the client connects to the server. However, it does not work the second (or any later) time, say after a client restart. Per my understanding so far, two factors cause this issue:

1) The QEMU socket char dev is designed so that one server accepts one client only. A server will not accept another client unless no connection is established, or the existing connection is disconnected.

2) In the vhost-user case, QEMU serves as the initiator, sending vhost-user messages on its own. In other words, QEMU does not poll the socket fd, unlike DPDK vhost-user. That is to say, QEMU has no way to know when the connection is disconnected.

Because of 1), QEMU will not be able to accept the second connection from DPDK when DPDK vhost restarts.

In summary, we need to figure out a way to detect the disconnect in QEMU. I came up with a solution that might work: poll the socket fd and peek (instead of read) the data. The peek would fail if the connection is disconnected. I had a quick hack; unluckily, it didn't work as expected. I'm thinking I might have missed something.

I could spend more time digging into it. But before that, I want to know what you guys think about it. Does that proposal make sense to you? Or any better ideas?

Thanks.

--yliu
Re: [Qemu-devel] [PATCH] vhost: drop dead code
On Wed, Dec 02, 2015 at 01:50:35PM +0200, Michael S. Tsirkin wrote:
> We dropped the implementation of vhost_dev_query,
> drop it from the header file as well.
>
> Signed-off-by: Michael S. Tsirkin

Reviewed-by: Yuanhan Liu

--yliu

> ---
>  include/hw/virtio/vhost.h | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 7437fd4..b60d758 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -66,7 +66,6 @@ struct vhost_dev {
>  int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>                     VhostBackendType backend_type);
>  void vhost_dev_cleanup(struct vhost_dev *hdev);
> -bool vhost_dev_query(struct vhost_dev *hdev, VirtIODevice *vdev);
>  int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev);
>  void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev);
>  int vhost_dev_enable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev);
> --
> MST
Re: [Qemu-devel] [PATCH] Revert "vhost: send SET_VRING_ENABLE at start/stop"
On Thu, Nov 26, 2015 at 06:52:56PM +0200, Michael S. Tsirkin wrote:
> On Thu, Nov 26, 2015 at 09:46:08AM +0800, Yuanhan Liu wrote:
> > On Wed, Nov 25, 2015 at 02:42:05PM +0200, Michael S. Tsirkin wrote:
> > > This reverts commit 3a12f32229a046f4d4ab0a3a52fb01d2d5a1ab76.
> > >
> > > In case of live migration several queues can be enabled, and not only
> > > the first one. So informing the backend that only the first queue is
> > > enabled is wrong.
> >
> > Reviewed-by: Yuanhan Liu
> >
> > BTW, we should also update the spec about ring stop, right?
> >
> > --yliu
>
> Pls take a look at for_upstream in my tree, and tell me
> whether there's anything we need to clarify.

I had a quick check and found it's fine. I was thinking we still needed to clarify that there are two ways of signaling a ring stop: VHOST_USER_GET_VRING_BASE and VHOST_USER_SET_VRING_ENABLE. Sorry for the noise, then.

--yliu
Re: [Qemu-devel] [PATCH] Revert "vhost: send SET_VRING_ENABLE at start/stop"
On Wed, Nov 25, 2015 at 02:42:05PM +0200, Michael S. Tsirkin wrote: > This reverts commit 3a12f32229a046f4d4ab0a3a52fb01d2d5a1ab76. > > In case of live migration several queues can be enabled and not only the > first one. So informing backend that only the first queue is enabled is > wrong. Reviewed-by: Yuanhan Liu BTW, we should also update the spec about ring stop, right? --yliu > > Reported-by: Thibaut Collet > Cc: Yuanhan Liu > Signed-off-by: Michael S. Tsirkin > --- > hw/virtio/vhost.c | 9 - > 1 file changed, 9 deletions(-) > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > index 1794f0d..de29968 100644 > --- a/hw/virtio/vhost.c > +++ b/hw/virtio/vhost.c > @@ -1226,11 +1226,6 @@ int vhost_dev_start(struct vhost_dev *hdev, > VirtIODevice *vdev) > } > } > > -if (hdev->vhost_ops->vhost_set_vring_enable) { > -/* only enable first vq pair by default */ > -hdev->vhost_ops->vhost_set_vring_enable(hdev, hdev->vq_index == 0); > -} > - > return 0; > fail_log: > vhost_log_put(hdev, false); > @@ -1261,10 +1256,6 @@ void vhost_dev_stop(struct vhost_dev *hdev, > VirtIODevice *vdev) > hdev->vq_index + i); > } > > -if (hdev->vhost_ops->vhost_set_vring_enable) { > -hdev->vhost_ops->vhost_set_vring_enable(hdev, 0); > -} > - > vhost_log_put(hdev, true); > hdev->started = false; > hdev->log = NULL; > -- > MST
Re: [Qemu-devel] [PATCH for-2.5 1/1] vhost-user: do not send SET_VRING_ENABLE at start
On Tue, Nov 24, 2015 at 11:23:34PM +0200, Michael S. Tsirkin wrote: > On Tue, Nov 24, 2015 at 10:05:27PM +0100, Thibaut Collet wrote: > > On Tue, Nov 24, 2015 at 9:52 PM, Michael S. Tsirkin wrote: > > > On Tue, Nov 24, 2015 at 05:10:36PM +0100, Thibaut Collet wrote: > > >> This patch reverts partially commit 3a12f32229a. > > >> > > >> In case of live migration several queues can be enabled and not only the > > >> first > > >> one. So inform backend that only the first queue is enabled is wrong. > > >> > > >> Since commit 7263a0ad7899 backend is already notified of the state of > > >> the vring > > >> through the vring attach operation. This function, called during the > > >> startup > > >> sequence, provides the correct state of the vring, even in case of live > > >> migration. > > >> > > >> So nothing has to be added to give the vring state to the backend at the > > >> startup. > > >> > > >> Signed-off-by: Thibaut Collet > > >> --- > > >> hw/virtio/vhost.c | 5 - > > >> 1 file changed, 5 deletions(-) > > >> > > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > > >> index 1794f0d..870cd12 100644 > > >> --- a/hw/virtio/vhost.c > > >> +++ b/hw/virtio/vhost.c > > >> @@ -1226,11 +1226,6 @@ int vhost_dev_start(struct vhost_dev *hdev, > > >> VirtIODevice *vdev) > > >> } > > >> } > > >> > > >> -if (hdev->vhost_ops->vhost_set_vring_enable) { > > >> -/* only enable first vq pair by default */ > > >> -hdev->vhost_ops->vhost_set_vring_enable(hdev, hdev->vq_index == > > >> 0); > > >> -} > > >> - > > >> return 0; > > >> fail_log: > > >> vhost_log_put(hdev, false); > > >> -- > > >> 2.1.4 > > > > > > Yes - and I'm beginning to think that maybe we should revert > > > all of 3a12f32229a then, for symmetry. > > > > > > > Keep the disable vring on the stop can be useful. 
> > For example, if the VM is rebooted, all the vrings will be disabled,
> > and the backend will avoid sending packets to the VM in this case (I
> > am not sure the virtio ring address is always valid during a reboot,
> > and writing data into this memory can cause unexpected behaviour in
> > this case).
>
> I think there's still some confusion:
> writing memory can still happen even if you disable the ring,
> since the TX ring is still processed, so we write into the used ring.
>
> We call GET_VRING_BASE on stop, and that ensures rings are
> stopped.

Yes, that's what I suggested first, which also makes the logic quite simple: we use GET_VRING_BASE as the sign of vring stop, instead of GET_VRING_BASE when protocol features are not negotiated and SET_VRING_ENABLE when they are.

Michael, should I submit a revert patch, or could you do it directly?

--yliu

> > > Yuanhan, Victor - what do you think?
> > >
> > > --
> > > MST
Re: [Qemu-devel] [PATCH for-2.5] vhost-user: clarify start and enable
On Mon, Nov 23, 2015 at 12:54:56PM +0200, Michael S. Tsirkin wrote: > It seems that we currently have some duplication between > started and enabled states. > > The actual reason is that enable is not documented correctly: > what it does is connecting ring to the backend. > > This is important for MQ, because a Linux guest expects TX > packets to be completed even if it disables some queues > temporarily. Thanks for the clarification. And Reviewed-by: Yuanhan Liu --yliu > > Cc: Yuanhan Liu > Cc: Victor Kaplansky > Signed-off-by: Michael S. Tsirkin > --- > docs/specs/vhost-user.txt | 20 +++- > 1 file changed, 15 insertions(+), 5 deletions(-) > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt > index 7b9cd6d..0312d40 100644 > --- a/docs/specs/vhost-user.txt > +++ b/docs/specs/vhost-user.txt > @@ -148,13 +148,23 @@ a feature bit was dedicated for this purpose: > > Starting and stopping rings > -- > -Client must only process each ring when it is both started and enabled. > +Client must only process each ring when it is started. > + > +Client must only pass data between the ring and the > +backend, when the ring is enabled. > + > +If ring is started but disabled, client must process the > +ring without talking to the backend. > + > +For example, for a networking device, in the disabled state > +client must not supply any new RX packets, but must process > +and discard any TX packets. > > If VHOST_USER_F_PROTOCOL_FEATURES has not been negotiated, the ring is > initialized > in an enabled state. > > If VHOST_USER_F_PROTOCOL_FEATURES has been negotiated, the ring is > initialized > -in a disabled state. Client must not process it until ring is enabled by > +in a disabled state. Client must not pass data to/from the backend until > ring is enabled by > VHOST_USER_SET_VRING_ENABLE with parameter 1, or after it has been disabled > by > VHOST_USER_SET_VRING_ENABLE with parameter 0. 
> > @@ -166,7 +176,7 @@ descriptor is readable) on the descriptor specified by > VHOST_USER_SET_VRING_KICK, and stop ring upon receiving > VHOST_USER_GET_VRING_BASE. > > -While processing the rings (when they are started and enabled), client must > +While processing the rings (whether they are enabled or not), client must > support changing some configuration aspects on the fly. > > Multiple queue support > @@ -309,11 +319,11 @@ Message types >Id: 4 >Master payload: N/A > > - This is no longer used. Used to be sent to request stopping > + This is no longer used. Used to be sent to request disabling >all rings, but some clients interpreted it to also discard >connection state (this interpretation would lead to bugs). >It is recommended that clients either ignore this message, > - or use it to stop all rings. > + or use it to disable all rings. > > * VHOST_USER_SET_MEM_TABLE > > -- > MST
Re: [Qemu-devel] [PATCH v4 0/5] handle vhost reset/start/stop correctly
On Fri, Nov 13, 2015 at 12:24:52PM +0200, Michael S. Tsirkin wrote: > On Fri, Nov 13, 2015 at 10:03:29AM +0800, Yuanhan Liu wrote: > > On Thu, Nov 12, 2015 at 04:44:19PM +0200, Michael S. Tsirkin wrote: > > > On Thu, Nov 12, 2015 at 10:08:15PM +0800, Yuanhan Liu wrote: > > > > On Thu, Nov 12, 2015 at 03:33:41PM +0200, Michael S. Tsirkin wrote: > > > > > On Wed, Nov 11, 2015 at 09:24:36PM +0800, Yuanhan Liu wrote: > > > > > > > > > > > > Patch 1 rename RESET_DEVICE back to RESET_OWNER > > > > > > > > > > > > Patch 2 introduced a new function: vhost_net_reset(), which is > > > > > > invoked > > > > > > when reset happens, say, driver is unloaded (unbinded). > > > > > > > > > > > > Michael suggested to do that only when MQ is not negotiated. > > > > > > However, reset happens no matter MQ is enabled or negotiated > > > > > > or not, and we should give a sign to backend to reset some > > > > > > device to a proper state after it happens. > > > > > > > > > > I don't think it's needed: we set all state at start anyway. > > > > > > > > Agree with that. > > > > > > > > > > > > > > > Note that the message sent is still RESET_OWNER. It might > > > > > > not > > > > > > be a good idea, but we could not simply rename it to > > > > > > RESET_DEVICE, > > > > > > and maybe adding another RESET_DEVICE might be better. > > > > > > > > > > So modern clients don't need this at all. Old clients need something > > > > > to > > > > > stop device, but singling out reset is not a good idea: even if driver > > > > > is unloaded, you need to do that. snabbswitch (ab)uses RESET_OWNER > > > > > for > > > > > this, so maybe RESET_OWNER should be renamed DISABLE_ALL, to just make > > > > > it stop all queues. > > > > > > > > > > Does any dpdk version that was released respond to RESET_OWNER in some > > > > > way? How exactly? > > > > > > > > It just resets some states, such as closing call fd and kick fd, > > > > unmapping buf from hugetlbfs from set_mem_table. 
> > > And is this in some dpdk version that has been released?
> >
> > Yes, it's been there a long time, and you can find it in recent
> > releases (say, v2.1).
> >
> > However, QEMU only added RESET_OWNER in v2.4.0, and it's likely that
> > we will remove it as of this version, v2.5.0.
>
> I see. So I think that if MQ is negotiated, we should not send it.
> VRING_ENABLE(0) is enough. But what if MQ is not negotiated?
> In any case, it must be done after all rings are stopped in vhost_dev_stop.
> Otherwise we can not retrieve the ring index so we can't resume.
>
> I'm not sure it's worth the hassle.
> To work correctly, when MQ is not negotiated then clients
> need to stop the ring on GET_VRING_BASE, and that seems enough
> to make everything work correctly across reboot.
> If MQ is negotiated then there's VRING_ENABLE, so hacks
> are not needed anymore.

I'm okay with that. And if so, we may need to put that kind of information into the vhost-user spec, to make it a standard.

Or, I'm wondering, could we make it consistent: send GET_VRING_BASE as the only sign of vhost dev stop, no matter whether MQ is negotiated or not?

And I'm rethinking the necessity of sending VRING_ENABLE at start/stop: is it just because it's a better name than GET_VRING_BASE? With the current code, VRING_ENABLE will be sent in the following places:

- virtio_net_set_queues, which is invoked in the virtio-net initialization stage. It is also invoked after the user executes the "ethtool -L" command.

- vhost_dev_start

- vhost_dev_stop

We might need to take different actions for each of the above cases; say, we need to release some resources (such as closing the kick fd) on vhost_dev_stop. But we definitely could not do that for VRING_ENABLE messages from ethtool, as the user can re-run the ethtool command to bring those vrings back to a functional state: doing that kind of resource release would make it impossible to bring them back.
So, it might be difficult (or, more accurately, messy) for the client to figure out which case it is: when we should release some resources, and when we should not.

--yliu

> > Cc Luke for an o
[Qemu-devel] [PATCH 1/2] vhost: let SET_VRING_ENABLE message depend on protocol feature
But not depend on the PROTOCOL_F_MQ feature bit, so that we can use SET_VRING_ENABLE to signal the backend on stop, even if MQ is disabled. That's reasonable, since we will have at least one queue pair.

Signed-off-by: Yuanhan Liu
---
 hw/virtio/vhost-user.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 4766f98..a9c8335 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -331,7 +331,7 @@ static int vhost_user_set_vring_enable(struct vhost_dev *dev, int enable)
         .num = enable,
     };
 
-    if (!(dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MQ))) {
+    if (!virtio_has_feature(dev->features, VHOST_USER_F_PROTOCOL_FEATURES)) {
         return -1;
     }
-- 
1.9.3
[Qemu-devel] [PATCH 2/2] vhost: don't send RESET_OWNER at stop
First of all, the RESET_OWNER message is sent incorrectly, as it's sent before GET_VRING_BASE, and the reset would leave that later call with nothing correct to return. Besides, sending SET_VRING_ENABLE at stop, which is already done, makes more sense than RESET_OWNER.

Signed-off-by: Yuanhan Liu
---
 hw/net/vhost_net.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index d91b7b1..14337a4 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -292,12 +292,6 @@ static void vhost_net_stop_one(struct vhost_net *net,
             int r = vhost_ops->vhost_net_set_backend(&net->dev, &file);
             assert(r >= 0);
         }
-    } else if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) {
-        for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
-            const VhostOps *vhost_ops = net->dev.vhost_ops;
-            int r = vhost_ops->vhost_reset_device(&net->dev);
-            assert(r >= 0);
-        }
     }
     if (net->nc->info->poll) {
         net->nc->info->poll(net->nc, true);
-- 
1.9.3
Re: [Qemu-devel] [PATCH v4 0/5] handle vhost reset/start/stop correctly
On Thu, Nov 12, 2015 at 04:44:19PM +0200, Michael S. Tsirkin wrote: > On Thu, Nov 12, 2015 at 10:08:15PM +0800, Yuanhan Liu wrote: > > On Thu, Nov 12, 2015 at 03:33:41PM +0200, Michael S. Tsirkin wrote: > > > On Wed, Nov 11, 2015 at 09:24:36PM +0800, Yuanhan Liu wrote: > > > > > > > > Patch 1 rename RESET_DEVICE back to RESET_OWNER > > > > > > > > Patch 2 introduced a new function: vhost_net_reset(), which is invoked > > > > when reset happens, say, driver is unloaded (unbinded). > > > > > > > > Michael suggested to do that only when MQ is not negotiated. > > > > However, reset happens no matter MQ is enabled or negotiated > > > > or not, and we should give a sign to backend to reset some > > > > device to a proper state after it happens. > > > > > > I don't think it's needed: we set all state at start anyway. > > > > Agree with that. > > > > > > > > > Note that the message sent is still RESET_OWNER. It might not > > > > be a good idea, but we could not simply rename it to > > > > RESET_DEVICE, > > > > and maybe adding another RESET_DEVICE might be better. > > > > > > So modern clients don't need this at all. Old clients need something to > > > stop device, but singling out reset is not a good idea: even if driver > > > is unloaded, you need to do that. snabbswitch (ab)uses RESET_OWNER for > > > this, so maybe RESET_OWNER should be renamed DISABLE_ALL, to just make > > > it stop all queues. > > > > > > Does any dpdk version that was released respond to RESET_OWNER in some > > > way? How exactly? > > > > It just resets some states, such as closing call fd and kick fd, > > unmapping buf from hugetlbfs from set_mem_table. > > And is this in some dpdk version that has been released? Yes, it's been there long time ago, and you can find them in recent releases (say, v2.1). However, QEMU just added RESET_OWNER since v2.4.0, and it's likely that we will remove it since this version, v2.5.0. > > > And, apparently we could do that on stop, too. 
> > So, from this pov, we don't need RESET_OWNER.
> >
> > > > Patch 3 and 4 send SET_PROTOCOL_FEATURES at start, just like we send
> > > > SET_FEATURES.
> > > >
> > > > Patch 5 sends SET_VRING_ENABLE at start/stop.
> > > >
> > > > Michael, I intended to send it when MQ is negotiated, as you
> > > > suggested; however, I found that VHOST_USER_PROTOCOL_F_MQ is defined
> > > > in vhost-user.c, which is not accessible to vhost.c.
> > > >
> > > > Exporting it to vhost.h would resolve that; however, it's not a good
> > > > idea to move vhost-user specific stuff there. We could also introduce
> > > > another vhost callback to test whether MQ is negotiated: I just don't
> > > > think it's worth it.
> > > >
> > > > Hence, here I just used a simple test: invoke set_vring_enable() just
> > > > when it is defined. Judging that the callback itself has another MQ
> > > > check, I guess it's okay.
> > > >
> > > > And sorry that it took so long to send this version.
> > >
> > > Hmm, you are saying everyone needs SET_VRING_ENABLE?
> > > Maybe we should make SET_VRING_ENABLE depend on protocol features
> > > then, and not MQ?
> >
> > I'm thinking the same. Otherwise, there is still no way to inform the
> > backend (or client) that a vhost dev is stopped when MQ is disabled
> > (which is the default state).
> >
> > So, let's assume all clients have protocol features enabled, and send
> > SET_VRING_ENABLE at start/stop? And if a client does not, it's just
> > like we are back to QEMU v2.3, where no RESET_OWNER nor
> > SET_VRING_ENABLE messages are sent on stop: it worked before, and it
> > should also work now.
>
> So maybe we should drop RESET_OWNER from stop then?

Yeah, I agree with you. Sending a reset message at stop doesn't make too much sense.

> > > I applied patches 1 and 5 for now.
> >
> > Do you have comments about patches 3 and 4? Should we set protocol
> > features just like we set features at start?
> >
> > Thanks.
> > --yliu
>
> We don't set them at start - we set them on connect.

Okay to me. I will drop them then.

--yliu
Re: [Qemu-devel] [PATCH v4 0/5] handle vhost reset/start/stop correctly
On Thu, Nov 12, 2015 at 03:33:41PM +0200, Michael S. Tsirkin wrote: > On Wed, Nov 11, 2015 at 09:24:36PM +0800, Yuanhan Liu wrote: > > > > Patch 1 rename RESET_DEVICE back to RESET_OWNER > > > > Patch 2 introduced a new function: vhost_net_reset(), which is invoked > > when reset happens, say, driver is unloaded (unbinded). > > > > Michael suggested to do that only when MQ is not negotiated. > > However, reset happens no matter MQ is enabled or negotiated > > or not, and we should give a sign to backend to reset some > > device to a proper state after it happens. > > I don't think it's needed: we set all state at start anyway. Agree with that. > > > Note that the message sent is still RESET_OWNER. It might not > > be a good idea, but we could not simply rename it to RESET_DEVICE, > > and maybe adding another RESET_DEVICE might be better. > > So modern clients don't need this at all. Old clients need something to > stop device, but singling out reset is not a good idea: even if driver > is unloaded, you need to do that. snabbswitch (ab)uses RESET_OWNER for > this, so maybe RESET_OWNER should be renamed DISABLE_ALL, to just make > it stop all queues. > > Does any dpdk version that was released respond to RESET_OWNER in some > way? How exactly? It just resets some states, such as closing call fd and kick fd, unmapping buf from hugetlbfs from set_mem_table. And, apparently we could do that on stop, too. So, from this pov, we don't need RESET_OWNER. > > > Patch 3 and 4 send SET_PROTOCOL_FEATURES at start, just like we send > > SET_FEATURES. > > > > Patch 5 send SET_VRING_ENABLE at start/stop > > > > Michael, I intended to send it when MQ is negotiated as you > > suggested, > > however, I found that VHOST_USER_PROTOCOL_F_MQ is defined in > > vhost-user.c, > > which is not accessible to vhost.c. > > > > Exporting it to vhost.h will resolve that, however, it's not a good > > idea to move vhost user specific stuff to there. 
> > We could also introduce another vhost callback to test whether MQ is
> > negotiated: I just don't think it's worth it.
> >
> > Hence, here I just used a simple test: invoke set_vring_enable() just
> > when it is defined. Judging that the callback itself has another MQ
> > check, I guess it's okay.
> >
> > And sorry that it took so long to send this version.
>
> Hmm, you are saying everyone needs SET_VRING_ENABLE?
> Maybe we should make SET_VRING_ENABLE depend on protocol features then,
> and not MQ?

I'm thinking the same. Otherwise, there is still no way to inform the backend (or client) that a vhost dev is stopped when MQ is disabled (which is the default state).

So, let's assume all clients have protocol features enabled, and send SET_VRING_ENABLE at start/stop? And if a client does not, it's just like we are back to QEMU v2.3, where no RESET_OWNER nor SET_VRING_ENABLE messages are sent on stop: it worked before, and it should also work now.

> I applied patches 1 and 5 for now.

Do you have comments about patches 3 and 4? Should we set protocol features just like we set features at start?

Thanks.

--yliu
[Qemu-devel] [PATCH v4 5/5] vhost: send SET_VRING_ENABLE at start/stop
Send SET_VRING_ENABLE at start/stop, to give the backend an explicit
sign of our state.

Signed-off-by: Yuanhan Liu
---
 hw/virtio/vhost.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index be48511..0e956b5 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1234,6 +1234,11 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
         }
     }
 
+    if (hdev->vhost_ops->vhost_set_vring_enable) {
+        /* only enable first vq pair by default */
+        hdev->vhost_ops->vhost_set_vring_enable(hdev, hdev->vq_index == 0);
+    }
+
     return 0;
 fail_log:
     vhost_log_put(hdev, false);
@@ -1264,6 +1269,10 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
                              hdev->vq_index + i);
     }
 
+    if (hdev->vhost_ops->vhost_set_vring_enable) {
+        hdev->vhost_ops->vhost_set_vring_enable(hdev, 0);
+    }
+
     vhost_log_put(hdev, true);
     hdev->started = false;
     hdev->log = NULL;
-- 
1.9.0
[Qemu-devel] [PATCH v4 4/5] vhost: send SET_PROTOCOL_FEATURES at start
So that the backend can restore the protocol features after a reset.

Signed-off-by: Yuanhan Liu
---
 hw/virtio/vhost.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index de29968..be48511 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1195,6 +1195,14 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
     if (r < 0) {
         goto fail_features;
     }
+    if (hdev->vhost_ops->vhost_set_protocol_features) {
+        r = hdev->vhost_ops->vhost_set_protocol_features(hdev,
+                hdev->protocol_features);
+        if (r < 0) {
+            goto fail_features;
+        }
+    }
+
     r = hdev->vhost_ops->vhost_set_mem_table(hdev, hdev->mem);
     if (r < 0) {
         r = -errno;
-- 
1.9.0
[Qemu-devel] [PATCH v4 1/5] vhost: rename RESET_DEVICE back to RESET_OWNER
This patch basically reverts commit d1f8b30e. It turned out that it breaks stuff, so revert it: http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg00949.html CC: "Michael S. Tsirkin" Reported-by: Paolo Bonzini Signed-off-by: Yuanhan Liu --- docs/specs/vhost-user.txt | 4 ++-- hw/virtio/vhost-backend.c | 2 +- hw/virtio/vhost-user.c | 6 +++--- linux-headers/linux/vhost.h | 2 +- tests/vhost-user-bridge.c | 6 +++--- tests/vhost-user-test.c | 4 ++-- 6 files changed, 12 insertions(+), 12 deletions(-) diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt index e0d71e2..d319715 100644 --- a/docs/specs/vhost-user.txt +++ b/docs/specs/vhost-user.txt @@ -255,10 +255,10 @@ Message types as an owner of the session. This can be used on the Slave as a "session start" flag. - * VHOST_USER_RESET_DEVICE + * VHOST_USER_RESET_OWNER Id: 4 - Equivalent ioctl: VHOST_RESET_DEVICE + Equivalent ioctl: VHOST_RESET_OWNER Master payload: N/A Issued when a new connection is about to be closed. The Master will no diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c index 1d5f684..b734a60 100644 --- a/hw/virtio/vhost-backend.c +++ b/hw/virtio/vhost-backend.c @@ -156,7 +156,7 @@ static int vhost_kernel_set_owner(struct vhost_dev *dev) static int vhost_kernel_reset_device(struct vhost_dev *dev) { -return vhost_kernel_call(dev, VHOST_RESET_DEVICE, NULL); +return vhost_kernel_call(dev, VHOST_RESET_OWNER, NULL); } static int vhost_kernel_get_vq_index(struct vhost_dev *dev, int idx) diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 83c84f1..4766f98 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -43,7 +43,7 @@ typedef enum VhostUserRequest { VHOST_USER_GET_FEATURES = 1, VHOST_USER_SET_FEATURES = 2, VHOST_USER_SET_OWNER = 3, -VHOST_USER_RESET_DEVICE = 4, +VHOST_USER_RESET_OWNER = 4, VHOST_USER_SET_MEM_TABLE = 5, VHOST_USER_SET_LOG_BASE = 6, VHOST_USER_SET_LOG_FD = 7, @@ -157,7 +157,7 @@ static bool 
vhost_user_one_time_request(VhostUserRequest request) { switch (request) { case VHOST_USER_SET_OWNER: -case VHOST_USER_RESET_DEVICE: +case VHOST_USER_RESET_OWNER: case VHOST_USER_SET_MEM_TABLE: case VHOST_USER_GET_QUEUE_NUM: return true; @@ -486,7 +486,7 @@ static int vhost_user_set_owner(struct vhost_dev *dev) static int vhost_user_reset_device(struct vhost_dev *dev) { VhostUserMsg msg = { -.request = VHOST_USER_RESET_DEVICE, +.request = VHOST_USER_RESET_OWNER, .flags = VHOST_USER_VERSION, }; diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h index 14a0160..ead86db 100644 --- a/linux-headers/linux/vhost.h +++ b/linux-headers/linux/vhost.h @@ -78,7 +78,7 @@ struct vhost_memory { #define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01) /* Give up ownership, and reset the device to default values. * Allows subsequent call to VHOST_OWNER_SET to succeed. */ -#define VHOST_RESET_DEVICE _IO(VHOST_VIRTIO, 0x02) +#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02) /* Set up/modify memory layout */ #define VHOST_SET_MEM_TABLE_IOW(VHOST_VIRTIO, 0x03, struct vhost_memory) diff --git a/tests/vhost-user-bridge.c b/tests/vhost-user-bridge.c index fa18ad5..864f69e 100644 --- a/tests/vhost-user-bridge.c +++ b/tests/vhost-user-bridge.c @@ -188,7 +188,7 @@ typedef enum VhostUserRequest { VHOST_USER_GET_FEATURES = 1, VHOST_USER_SET_FEATURES = 2, VHOST_USER_SET_OWNER = 3, -VHOST_USER_RESET_DEVICE = 4, +VHOST_USER_RESET_OWNER = 4, VHOST_USER_SET_MEM_TABLE = 5, VHOST_USER_SET_LOG_BASE = 6, VHOST_USER_SET_LOG_FD = 7, @@ -274,7 +274,7 @@ static const char *vubr_request_str[] = { [VHOST_USER_GET_FEATURES] = "VHOST_USER_GET_FEATURES", [VHOST_USER_SET_FEATURES] = "VHOST_USER_SET_FEATURES", [VHOST_USER_SET_OWNER] = "VHOST_USER_SET_OWNER", -[VHOST_USER_RESET_DEVICE] = "VHOST_USER_RESET_DEVICE", +[VHOST_USER_RESET_OWNER] = "VHOST_USER_RESET_OWNER", [VHOST_USER_SET_MEM_TABLE] = "VHOST_USER_SET_MEM_TABLE", [VHOST_USER_SET_LOG_BASE] = "VHOST_USER_SET_LOG_BASE", [VHOST_USER_SET_LOG_FD] = 
"VHOST_USER_SET_LOG_FD", @@ -921,7 +921,7 @@ vubr_execute_request(VubrDev *dev, VhostUserMsg *vmsg) return vubr_set_features_exec(dev, vmsg); case VHOST_USER_SET_OWNER: return vubr_set_owner_exec(dev, vmsg); -case VHOST_USER_RESET_DEVICE: +case VHOST_USER_RESET_OWNER: return vubr_reset_device_exec(dev, vmsg); case VHOST_USER_SET_MEM_TABLE: return vubr_set_mem_table_exec(dev, vmsg); diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c index b6dde75..66778c4 100644 --- a/t
[Qemu-devel] [PATCH v4 0/5] handle vhost reset/start/stop correctly
Patch 1 renames RESET_DEVICE back to RESET_OWNER.

Patch 2 introduces a new function, vhost_net_reset(), which is invoked when a reset happens, say, when the driver is unloaded (unbound).

Michael suggested doing that only when MQ is not negotiated. However, a reset happens whether or not MQ is enabled or negotiated, and we should give the backend a sign to reset the device to a proper state after it happens.

Note that the message sent is still RESET_OWNER. It might not be a good idea, but we could not simply rename it to RESET_DEVICE; maybe adding another RESET_DEVICE message would be better.

Patches 3 and 4 send SET_PROTOCOL_FEATURES at start, just like we send SET_FEATURES.

Patch 5 sends SET_VRING_ENABLE at start/stop.

Michael, I intended to send it when MQ is negotiated as you suggested; however, I found that VHOST_USER_PROTOCOL_F_MQ is defined in vhost-user.c, which is not accessible to vhost.c.

Exporting it to vhost.h would resolve that, but it's not a good idea to move vhost-user specific stuff there. We could also introduce another vhost callback to test whether MQ is negotiated; I just don't think it's worth it.

Hence, here I just used a simple test: invoke set_vring_enable() only when it is defined. Given that the callback itself has another MQ check, I guess it's okay.

And sorry that it took so long to send this version.
---

Yuanhan Liu (5):
  vhost: rename RESET_DEVICE back to RESET_OWNER
  vhost: reset vhost net when virtio_net_reset happens
  vhost: introduce vhost_set/get_protocol_features callbacks
  vhost: send SET_PROTOCOL_FEATURES at start
  vhost: send SET_VRING_ENABLE at start/stop

 docs/specs/vhost-user.txt         |  4 ++--
 hw/net/vhost_net.c                | 20 ++--
 hw/net/virtio-net.c               | 14 ++
 hw/virtio/vhost-backend.c         |  2 +-
 hw/virtio/vhost-user.c            | 13 ++---
 hw/virtio/vhost.c                 | 17 +
 include/hw/virtio/vhost-backend.h |  6 ++
 include/net/vhost_net.h           |  1 +
 linux-headers/linux/vhost.h       |  2 +-
 tests/vhost-user-bridge.c         |  6 +++---
 tests/vhost-user-test.c           |  4 ++--
 11 files changed, 71 insertions(+), 18 deletions(-)

-- 
1.9.0
[Qemu-devel] [PATCH v4 3/5] vhost: introduce vhost_set/get_protocol_features callbacks
Signed-off-by: Yuanhan Liu --- hw/virtio/vhost-user.c| 7 +++ include/hw/virtio/vhost-backend.h | 6 ++ 2 files changed, 13 insertions(+) diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 4766f98..2d8bdbd 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -471,6 +471,11 @@ static int vhost_user_get_features(struct vhost_dev *dev, uint64_t *features) return vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features); } +static int vhost_user_get_protocol_features(struct vhost_dev *dev, uint64_t *features) +{ +return vhost_user_get_u64(dev, VHOST_USER_GET_PROTOCOL_FEATURES, features); +} + static int vhost_user_set_owner(struct vhost_dev *dev) { VhostUserMsg msg = { @@ -616,6 +621,8 @@ const VhostOps user_ops = { .vhost_set_vring_call = vhost_user_set_vring_call, .vhost_set_features = vhost_user_set_features, .vhost_get_features = vhost_user_get_features, +.vhost_set_protocol_features = vhost_user_set_protocol_features, +.vhost_get_protocol_features = vhost_user_get_protocol_features, .vhost_set_owner = vhost_user_set_owner, .vhost_reset_device = vhost_user_reset_device, .vhost_get_vq_index = vhost_user_get_vq_index, diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h index c59cc81..7e705ce 100644 --- a/include/hw/virtio/vhost-backend.h +++ b/include/hw/virtio/vhost-backend.h @@ -62,6 +62,10 @@ typedef int (*vhost_set_features_op)(struct vhost_dev *dev, uint64_t features); typedef int (*vhost_get_features_op)(struct vhost_dev *dev, uint64_t *features); +typedef int (*vhost_set_protocol_features_op)(struct vhost_dev *dev, + uint64_t features); +typedef int (*vhost_get_protocol_features_op)(struct vhost_dev *dev, + uint64_t *features); typedef int (*vhost_set_owner_op)(struct vhost_dev *dev); typedef int (*vhost_reset_device_op)(struct vhost_dev *dev); typedef int (*vhost_get_vq_index_op)(struct vhost_dev *dev, int idx); @@ -91,6 +95,8 @@ typedef struct VhostOps { vhost_set_vring_call_op 
vhost_set_vring_call; vhost_set_features_op vhost_set_features; vhost_get_features_op vhost_get_features; +vhost_set_protocol_features_op vhost_set_protocol_features; +vhost_get_protocol_features_op vhost_get_protocol_features; vhost_set_owner_op vhost_set_owner; vhost_reset_device_op vhost_reset_device; vhost_get_vq_index_op vhost_get_vq_index; -- 1.9.0
[Qemu-devel] [PATCH v4 2/5] vhost: reset vhost net when virtio_net_reset happens
When a virtio net driver is unloaded (unbind), virtio_net_reset happens. For vhost net, we should also send a message (RESET_OWNER) to the backend to do some proper reset settings. Signed-off-by: Yuanhan Liu --- hw/net/vhost_net.c | 20 ++-- hw/net/virtio-net.c | 14 ++ include/net/vhost_net.h | 1 + 3 files changed, 29 insertions(+), 6 deletions(-) diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index d91b7b1..387f851 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -292,12 +292,6 @@ static void vhost_net_stop_one(struct vhost_net *net, int r = vhost_ops->vhost_net_set_backend(&net->dev, &file); assert(r >= 0); } -} else if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) { -for (file.index = 0; file.index < net->dev.nvqs; ++file.index) { -const VhostOps *vhost_ops = net->dev.vhost_ops; -int r = vhost_ops->vhost_reset_device(&net->dev); -assert(r >= 0); -} } if (net->nc->info->poll) { net->nc->info->poll(net->nc, true); @@ -382,6 +376,16 @@ void vhost_net_stop(VirtIODevice *dev, NetClientState *ncs, assert(vhost_net_set_vnet_endian(dev, ncs[0].peer, false) >= 0); } +void vhost_net_reset(VirtIODevice *vdev) +{ +VirtIONet *n = VIRTIO_NET(vdev); +struct vhost_net *net = get_vhost_net(n->nic->ncs[0].peer); + +if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) { +net->dev.vhost_ops->vhost_reset_device(&net->dev); +} +} + void vhost_net_cleanup(struct vhost_net *net) { vhost_dev_cleanup(&net->dev); @@ -469,6 +473,10 @@ void vhost_net_stop(VirtIODevice *dev, { } +void vhost_net_reset(VirtIODevice *vdev) +{ +} + void vhost_net_cleanup(struct vhost_net *net) { } diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index a877614..75472ba 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -318,10 +318,24 @@ static RxFilterInfo *virtio_net_query_rxfilter(NetClientState *nc) return info; } +static void virtio_net_vhost_reset(VirtIONet *n) +{ +VirtIODevice *vdev = VIRTIO_DEVICE(n); +NetClientState *nc = qemu_get_queue(n->nic); + 
+if (!get_vhost_net(nc->peer)) { +return; +} + +vhost_net_reset(vdev); +} + static void virtio_net_reset(VirtIODevice *vdev) { VirtIONet *n = VIRTIO_NET(vdev); +virtio_net_vhost_reset(n); + /* Reset back to compatibility mode */ n->promisc = 1; n->allmulti = 0; diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h index 3389b41..d1c1331 100644 --- a/include/net/vhost_net.h +++ b/include/net/vhost_net.h @@ -18,6 +18,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options); int vhost_net_start(VirtIODevice *dev, NetClientState *ncs, int total_queues); void vhost_net_stop(VirtIODevice *dev, NetClientState *ncs, int total_queues); +void vhost_net_reset(VirtIODevice *dev); void vhost_net_cleanup(VHostNetState *net); -- 1.9.0
Re: [Qemu-devel] [PATCH] vhost-user: set link down when the char device is closed
Thanks. Reviewed-by: Yuanhan Liu --yliu On Wed, Nov 11, 2015 at 02:53:29PM +0800, Wen Congyang wrote: > Signed-off-by: Wen Congyang > --- > net/vhost-user.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/vhost-user.c b/net/vhost-user.c > index 0ebd7df..5071602 100644 > --- a/net/vhost-user.c > +++ b/net/vhost-user.c > @@ -188,7 +188,7 @@ static void net_vhost_user_event(void *opaque, int event) > qmp_set_link(name, true, &err); > break; > case CHR_EVENT_CLOSED: > -qmp_set_link(name, true, &err); > +qmp_set_link(name, false, &err); > vhost_user_stop(queues, ncs); > break; > } > -- > 2.5.0
Re: [Qemu-devel] [PULL 12/22] vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE
On Fri, Nov 06, 2015 at 10:01:58AM +, Peter Maydell wrote:
> On 6 November 2015 at 01:34, Yuanhan Liu wrote:
> > On Thu, Nov 05, 2015 at 11:42:15AM +, Peter Maydell wrote:
> >> On 3 October 2015 at 17:33, Michael S. Tsirkin wrote:
> >> > On Fri, Oct 02, 2015 at 06:18:51PM +0200, Paolo Bonzini wrote:
> >> >> On 24/09/2015 15:20, Michael S. Tsirkin wrote:
> >> >> > From: Yuanhan Liu
> >> >> >
> >> >> > Quote from Michael:
> >> >> >
> >> >> > We really should rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE.
> >> >>
> >> >> Where is the corresponding Linux patch for this?
> >> >>
> >> >> I would like to fetch the updated headers for KVM, and this is breaking
> >> >> it. In fact, a patch that just renames the #define (without providing
> >> >> the old name for backwards compatibility) would be NACKed in upstream
> >> >> Linux.
> >> >>
> >> >> Paolo
> >> >
> >> > Right. And it turns out this whole approach is wrong. I intend to
> >> > revert this patch, and also drop the patch sending VHOST_RESET_OWNER on
> >> > device stop.
> >>
> >> This revert doesn't seem to have happened, I think, which means
> >> that this is one of the things which prevents a clean header-update
> >> against kvm/next. Could we get this fixed for rc0, please?
> >
> > My bad. I will fix it next week. What's the deadline for rc0 then?
>
> rc0 is 12th November (http://wiki.qemu.org/Planning/2.5). You need
> to also allow time for the patch to be reviewed and possibly taken
> via somebody's tree.

Got it, and thanks for the reminder.

	--yliu
Re: [Qemu-devel] [PULL 12/22] vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE
On Thu, Nov 05, 2015 at 11:42:15AM +, Peter Maydell wrote: > On 3 October 2015 at 17:33, Michael S. Tsirkin wrote: > > On Fri, Oct 02, 2015 at 06:18:51PM +0200, Paolo Bonzini wrote: > >> > >> > >> On 24/09/2015 15:20, Michael S. Tsirkin wrote: > >> > From: Yuanhan Liu > >> > > >> > Quote from Michael: > >> > > >> > We really should rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE. > >> > >> Where is the corresponding Linux patch for this? > >> > >> I would like to fetch the updated headers for KVM, and this is breaking > >> it. In fact, a patch that just renames the #define (without providing > >> the old name for backwards compatibility) would be NACKed in upstream > >> Linux. > >> > >> Paolo > > > > Right. And it turns out this whole approach is wrong. I intend to > > revert this patch, and also drop the patch sending VHOST_RESET_OWNER on > > device stop. > > This revert doesn't seem to have happened, I think, which means > that this is one of the things which prevents a clean header-update > against kvm/next. Could we get this fixed for rc0, please? My bad. I will fix it next week. What's the deadline for rc0 then? BTW, as Michael stated, a simple revert is not enough. --yliu
[Qemu-devel] [PATCH v3 1/3] vhost: rename VHOST_RESET_DEVICE back to VHOST_RESET_OWNER
It turned out that it breaks stuff (VHOST_RESET_OWNER is not defined), so revert it: http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg00949.html CC: "Michael S. Tsirkin" Reported-by: Paolo Bonzini Signed-off-by: Yuanhan Liu --- docs/specs/vhost-user.txt | 4 ++-- hw/virtio/vhost-backend.c | 2 +- hw/virtio/vhost-user.c | 6 +++--- linux-headers/linux/vhost.h | 2 +- tests/vhost-user-test.c | 4 ++-- 5 files changed, 9 insertions(+), 9 deletions(-) diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt index e0d71e2..d319715 100644 --- a/docs/specs/vhost-user.txt +++ b/docs/specs/vhost-user.txt @@ -255,10 +255,10 @@ Message types as an owner of the session. This can be used on the Slave as a "session start" flag. - * VHOST_USER_RESET_DEVICE + * VHOST_USER_RESET_OWNER Id: 4 - Equivalent ioctl: VHOST_RESET_DEVICE + Equivalent ioctl: VHOST_RESET_OWNER Master payload: N/A Issued when a new connection is about to be closed. The Master will no diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c index 1d5f684..b734a60 100644 --- a/hw/virtio/vhost-backend.c +++ b/hw/virtio/vhost-backend.c @@ -156,7 +156,7 @@ static int vhost_kernel_set_owner(struct vhost_dev *dev) static int vhost_kernel_reset_device(struct vhost_dev *dev) { -return vhost_kernel_call(dev, VHOST_RESET_DEVICE, NULL); +return vhost_kernel_call(dev, VHOST_RESET_OWNER, NULL); } static int vhost_kernel_get_vq_index(struct vhost_dev *dev, int idx) diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 78442ba..7c532f4 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -43,7 +43,7 @@ typedef enum VhostUserRequest { VHOST_USER_GET_FEATURES = 1, VHOST_USER_SET_FEATURES = 2, VHOST_USER_SET_OWNER = 3, -VHOST_USER_RESET_DEVICE = 4, +VHOST_USER_RESET_OWNER = 4, VHOST_USER_SET_MEM_TABLE = 5, VHOST_USER_SET_LOG_BASE = 6, VHOST_USER_SET_LOG_FD = 7, @@ -157,7 +157,7 @@ static bool vhost_user_one_time_request(VhostUserRequest request) { switch (request) { case 
VHOST_USER_SET_OWNER: -case VHOST_USER_RESET_DEVICE: +case VHOST_USER_RESET_OWNER: case VHOST_USER_SET_MEM_TABLE: case VHOST_USER_GET_QUEUE_NUM: return true; @@ -486,7 +486,7 @@ static int vhost_user_set_owner(struct vhost_dev *dev) static int vhost_user_reset_device(struct vhost_dev *dev) { VhostUserMsg msg = { -.request = VHOST_USER_RESET_DEVICE, +.request = VHOST_USER_RESET_OWNER, .flags = VHOST_USER_VERSION, }; diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h index 14a0160..ead86db 100644 --- a/linux-headers/linux/vhost.h +++ b/linux-headers/linux/vhost.h @@ -78,7 +78,7 @@ struct vhost_memory { #define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01) /* Give up ownership, and reset the device to default values. * Allows subsequent call to VHOST_OWNER_SET to succeed. */ -#define VHOST_RESET_DEVICE _IO(VHOST_VIRTIO, 0x02) +#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02) /* Set up/modify memory layout */ #define VHOST_SET_MEM_TABLE_IOW(VHOST_VIRTIO, 0x03, struct vhost_memory) diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c index a74c934..d2fc048 100644 --- a/tests/vhost-user-test.c +++ b/tests/vhost-user-test.c @@ -57,7 +57,7 @@ typedef enum VhostUserRequest { VHOST_USER_GET_FEATURES = 1, VHOST_USER_SET_FEATURES = 2, VHOST_USER_SET_OWNER = 3, -VHOST_USER_RESET_DEVICE = 4, +VHOST_USER_RESET_OWNER = 4, VHOST_USER_SET_MEM_TABLE = 5, VHOST_USER_SET_LOG_BASE = 6, VHOST_USER_SET_LOG_FD = 7, @@ -307,7 +307,7 @@ static void chr_read(void *opaque, const uint8_t *buf, int size) g_cond_signal(&s->data_cond); break; -case VHOST_USER_RESET_DEVICE: +case VHOST_USER_RESET_OWNER: s->fds_num = 0; break; -- 1.9.0
[Qemu-devel] [PATCH v3 2/3] doc: vhost-user: request naming fix
They are VHOST_USER_XXX instead of VHOST_XXX messages. Also, add VHOST_USER_GET_QUEUE_NUM to the section that requires replies.

Cc: Michael S. Tsirkin
Signed-off-by: Yuanhan Liu
---
 docs/specs/vhost-user.txt | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index d319715..0a5a346 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -112,20 +112,21 @@ The communication consists of master sending message requests and slave
 sending message replies. Most of the requests don't require replies.
 Here is a list of the ones that do:
 
- * VHOST_GET_FEATURES
- * VHOST_GET_PROTOCOL_FEATURES
- * VHOST_GET_VRING_BASE
- * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+ * VHOST_USER_GET_FEATURES
+ * VHOST_USER_GET_PROTOCOL_FEATURES
+ * VHOST_USER_GET_VRING_BASE
+ * VHOST_USER_GET_QUEUE_NUM
+ * VHOST_USER_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
 
 There are several messages that the master sends with file descriptors passed
 in the ancillary data:
 
- * VHOST_SET_MEM_TABLE
- * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
- * VHOST_SET_LOG_FD
- * VHOST_SET_VRING_KICK
- * VHOST_SET_VRING_CALL
- * VHOST_SET_VRING_ERR
+ * VHOST_USER_SET_MEM_TABLE
+ * VHOST_USER_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+ * VHOST_USER_SET_LOG_FD
+ * VHOST_USER_SET_VRING_KICK
+ * VHOST_USER_SET_VRING_CALL
+ * VHOST_USER_SET_VRING_ERR
 
 If Master is unable to send the full message or receives a wrong reply it will
 close the connection. An optional reconnection mechanism can be implemented.
-- 
1.9.0
[Qemu-devel] [PATCH v3 0/3] vhost-user mq sequel works
Patch 1 basically reverts the patch that did the rename, as it may break builds for which VHOST_RESET_OWNER is no longer defined. As Michael stated, reverting it is not enough: we need to send it from the right place, and send SET_VRING_ENABLE at start/stop, which is something I will do later.

Patch 2 is a minor fix of request naming in the vhost-user spec.

Patch 3 adds the vhost-user mq unit test.

---

Yuanhan Liu (3):
  vhost: rename VHOST_RESET_DEVICE back to VHOST_RESET_OWNER
  doc: vhost-user: request naming fix
  vhost-user-test: add multiple queue test

 docs/specs/vhost-user.txt   | 25 +
 hw/virtio/vhost-backend.c   |  2 +-
 hw/virtio/vhost-user.c      |  6 +++---
 linux-headers/linux/vhost.h |  2 +-
 tests/vhost-user-test.c     | 27 +++
 5 files changed, 41 insertions(+), 21 deletions(-)

-- 
1.9.0
[Qemu-devel] [PATCH v3 3/3] vhost-user-test: add multiple queue test
Setting VHOST_USER_PROTOCOL_F_MQ protocol feature bit to claim that we support MQ feature, and simply assume we support 2 queue pairs at most. Cc: Michael S. Tsirkin Cc: Jason Wang Signed-off-by: Yuanhan Liu --- tests/vhost-user-test.c | 23 +-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c index d2fc048..50e6c75 100644 --- a/tests/vhost-user-test.c +++ b/tests/vhost-user-test.c @@ -30,11 +30,17 @@ #define HAVE_MONOTONIC_TIME #endif +#define MAX_QUEUES 2 + +#define __STR(x)#x +#define STR(x) __STR(x) + #define QEMU_CMD_ACCEL " -machine accel=tcg" #define QEMU_CMD_MEM" -m %d -object memory-backend-file,id=mem,size=%dM,"\ "mem-path=%s,share=on -numa node,memdev=mem" #define QEMU_CMD_CHR" -chardev socket,id=%s,path=%s" -#define QEMU_CMD_NETDEV " -netdev vhost-user,id=net0,chardev=%s,vhostforce" +#define QEMU_CMD_NETDEV " -netdev vhost-user,id=net0,chardev=%s,vhostforce," \ +"queues=" STR(MAX_QUEUES) #define QEMU_CMD_NET" -device virtio-net-pci,netdev=net0 " #define QEMU_CMD_ROM" -option-rom ../pc-bios/pxe-virtio.rom" @@ -48,6 +54,7 @@ #define VHOST_MEMORY_MAX_NREGIONS8 #define VHOST_USER_F_PROTOCOL_FEATURES 30 +#define VHOST_USER_PROTOCOL_F_MQ0 #define VHOST_USER_PROTOCOL_F_LOG_SHMFD 1 #define VHOST_LOG_PAGE 0x1000 @@ -70,6 +77,8 @@ typedef enum VhostUserRequest { VHOST_USER_SET_VRING_ERR = 14, VHOST_USER_GET_PROTOCOL_FEATURES = 15, VHOST_USER_SET_PROTOCOL_FEATURES = 16, +VHOST_USER_GET_QUEUE_NUM = 17, +VHOST_USER_SET_VRING_ENABLE = 18, VHOST_USER_MAX } VhostUserRequest; @@ -258,7 +267,8 @@ static void chr_read(void *opaque, const uint8_t *buf, int size) /* send back features to qemu */ msg.flags |= VHOST_USER_REPLY_MASK; msg.size = sizeof(m.u64); -msg.u64 = 1 << VHOST_USER_PROTOCOL_F_LOG_SHMFD; +msg.u64 = (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) | + (1ULL << VHOST_USER_PROTOCOL_F_MQ); p = (uint8_t *) &msg; qemu_chr_fe_write_all(chr, p, VHOST_USER_HDR_SIZE + msg.size); break; @@ -307,6 +317,15 @@ static 
void chr_read(void *opaque, const uint8_t *buf, int size) g_cond_signal(&s->data_cond); break; +case VHOST_USER_GET_QUEUE_NUM: +/* send back the number of queues we support (let it be 2) to qemu */ +msg.flags |= VHOST_USER_REPLY_MASK; +msg.size = sizeof(m.u64); +msg.u64 = MAX_QUEUES; +p = (uint8_t *) &msg; +qemu_chr_fe_write_all(chr, p, VHOST_USER_HDR_SIZE + msg.size); +break; + case VHOST_USER_RESET_OWNER: s->fds_num = 0; break; -- 1.9.0
Re: [Qemu-devel] [PATCH v2 5/5] vhost: send VHOST_USER_SET_VRING_ENABLE at start/stop
On Wed, Oct 21, 2015 at 05:11:44PM +0300, Michael S. Tsirkin wrote:
> On Wed, Oct 21, 2015 at 09:43:16PM +0800, Yuanhan Liu wrote:
> > On Wed, Oct 21, 2015 at 01:39:11PM +0300, Michael S. Tsirkin wrote:
> > > On Wed, Oct 21, 2015 at 05:07:18PM +0800, Yuanhan Liu wrote:
> > > > Send VHOST_USER_SET_VRING_ENABLE at start/stop when multiple queue
> > > > is negotiated, to inform the backend that we are ready or not.
> > >
> > > OK but that's only if MQ is set.
> >
> > Maybe we could just call vhost_backend_set_vring_enable()
> > unconditionally? It's a nil operation when MQ is not set.
> >
> > > If not, we need to do RESET_OWNER followed by SET_OWNER.
> >
> > Could you be more specific? Why send RESET_OWNER followed by
> > SET_OWNER?
> >
> > TBH, I'm a bit confused with RESET_OWNER now: what it does, and when
> > we are supposed to send it :(
>
> It's not well specified, but it does say it's analogous to RESET_OWNER
> in the kernel. That one is well documented:
>
> /* Set current process as the (exclusive) owner of this file descriptor.
>  * This must be called before any other vhost command. Further calls to
>  * VHOST_OWNER_SET fail until VHOST_OWNER_RESET is called. */
> #define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01)
> /* Give up ownership, and reset the device to default values.
>  * Allows subsequent call to VHOST_OWNER_SET to succeed. */
> #define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)

Thanks, that helps (I think). I recalled my old question, and rechecked
your answer again:

    Because we need to get the state from remote after stop.
    RESET_OWNER discards that, so we can't resume the VM.

So, if I understand it correctly this time, you want to keep the VM
state at the backend side even after the VM is shut down, so that we can
resume it with those saved states.

And why not do that when MQ is enabled? I don't see that it has anything
to do with MQ.
> > So if we want just the reset part, we need to do VHOST_RESET_OWNER > then redo everything that we did previously: VHOST_SET_OWNER > SET_VRING_CALL etc etc. > > > And, sending RESET_OWNER inside virtio_net_reset() also looks weird. > > I made a quick try before sending this patchset, and the vhost-user > > request dump doesn't look right to me: the message is sent after > > vhost dev init (GET_FEATURES, GET_PROTOCOL_FEATURE, SET_OWNER, ..., > > SET_VRING_CALL), and before peer attach (SET_VRING_ENABLE) and > > vhost_dev_start (SET_MEM_TABLE, ... SET_VRING_KICK ...): > > Food for thought. Aha... > > > > > # start of a VM > > > > VHOST_CONFIG: new virtio connection is 28 > > VHOST_CONFIG: new device, handle is 0 > > VHOST_CONFIG: read message VHOST_USER_GET_FEATURES > > VHOST_CONFIG: read message VHOST_USER_GET_PROTOCOL_FEATURES > > VHOST_CONFIG: read message VHOST_USER_SET_PROTOCOL_FEATURES > > VHOST_CONFIG: read message VHOST_USER_GET_QUEUE_NUM > > VHOST_CONFIG: read message VHOST_USER_SET_OWNER > > VHOST_CONFIG: read message VHOST_USER_GET_FEATURES > > VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL > > VHOST_CONFIG: vring call idx:0 file:29 > > VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL > > VHOST_CONFIG: vring call idx:1 file:30 > > VHOST_CONFIG: read message VHOST_USER_GET_FEATURES > > VHOST_CONFIG: read message VHOST_USER_GET_PROTOCOL_FEATURES > > VHOST_CONFIG: read message VHOST_USER_SET_PROTOCOL_FEATURES > > VHOST_CONFIG: read message VHOST_USER_GET_QUEUE_NUM > > VHOST_CONFIG: read message VHOST_USER_GET_FEATURES > > VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL > > ... > > ... 
> > VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL > > VHOST_CONFIG: vring call idx:6 file:35 > > VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL > > VHOST_CONFIG: vring call idx:7 file:36 > > > > ==> VHOST_CONFIG: read message VHOST_USER_RESET_OWNER > > VHOST_CONFIG: read message VHOST_USER_RESET_OWNER > > > > VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE > > VHOST_CONFIG: set queue enable: 1 to qp idx: 0 > > VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE > > VHOST_CONFIG: set queue enable: 0 to qp idx: 2 > > VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE > > VHOST_CONFIG: set queue enable: 0 to qp idx: 4 > > VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE > > VHOST_CONFIG: set queue enable: 0 to qp idx: 6 > > VHOST_CONFIG
Re: [Qemu-devel] [PATCH v2 1/5] Revert "vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE"
On Wed, Oct 21, 2015 at 05:13:49PM +0300, Michael S. Tsirkin wrote: > On Wed, Oct 21, 2015 at 09:04:17PM +0800, Yuanhan Liu wrote: > > On Wed, Oct 21, 2015 at 01:40:59PM +0300, Michael S. Tsirkin wrote: > > > On Wed, Oct 21, 2015 at 05:07:14PM +0800, Yuanhan Liu wrote: > > > > This reverts commit d1f8b30ec8dde0318fd1b98d24a64926feae9625. > > > > > > > > It turned out that it breaks stuff, so revert it: > > > > > > > > > > > > http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg00949.html > > > > > > > > CC: "Michael S. Tsirkin" > > > > Reported-by: Paolo Bonzini > > > > Signed-off-by: Yuanhan Liu > > > > > > > > > Are these patches dependent on each other? If yes pls send a cover letter. > > > If not pls don't send a series, post them separately. > > > > Got it. > > > > > And, pls Cc me on all vhost user patches. > > > > Isn't I always doing that? :) > > > > --yliu > > You didn't for 5/5. Oops, that obviously is a mistake. Sorry for that. --yliu > > > > > > > > --- > > > > docs/specs/vhost-user.txt | 4 ++-- > > > > hw/net/vhost_net.c | 2 +- > > > > hw/virtio/vhost-user.c | 8 > > > > linux-headers/linux/vhost.h | 2 +- > > > > tests/vhost-user-test.c | 2 +- > > > > 5 files changed, 9 insertions(+), 9 deletions(-) > > > > > > > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt > > > > index 4eadad1..4bcd17d 100644 > > > > --- a/docs/specs/vhost-user.txt > > > > +++ b/docs/specs/vhost-user.txt > > > > @@ -211,10 +211,10 @@ Message types > > > >as an owner of the session. This can be used on the Slave as a > > > >"session start" flag. > > > > > > > > - * VHOST_USER_RESET_DEVICE > > > > + * VHOST_USER_RESET_OWNER > > > > > > > >Id: 4 > > > > - Equivalent ioctl: VHOST_RESET_DEVICE > > > > + Equivalent ioctl: VHOST_RESET_OWNER > > > >Master payload: N/A > > > > > > > >Issued when a new connection is about to be closed. 
The Master > > > > will no > > > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c > > > > index 2bce891..804f5c9 100644 > > > > --- a/hw/net/vhost_net.c > > > > +++ b/hw/net/vhost_net.c > > > > @@ -296,7 +296,7 @@ static void vhost_net_stop_one(struct vhost_net > > > > *net, > > > > } else if (net->nc->info->type == > > > > NET_CLIENT_OPTIONS_KIND_VHOST_USER) { > > > > for (file.index = 0; file.index < net->dev.nvqs; ++file.index) > > > > { > > > > const VhostOps *vhost_ops = net->dev.vhost_ops; > > > > -int r = vhost_ops->vhost_call(&net->dev, > > > > VHOST_RESET_DEVICE, > > > > +int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_OWNER, > > > >NULL); > > > > assert(r >= 0); > > > > } > > > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c > > > > index b11c0d2..12a9104 100644 > > > > --- a/hw/virtio/vhost-user.c > > > > +++ b/hw/virtio/vhost-user.c > > > > @@ -34,7 +34,7 @@ typedef enum VhostUserRequest { > > > > VHOST_USER_GET_FEATURES = 1, > > > > VHOST_USER_SET_FEATURES = 2, > > > > VHOST_USER_SET_OWNER = 3, > > > > -VHOST_USER_RESET_DEVICE = 4, > > > > +VHOST_USER_RESET_OWNER = 4, > > > > VHOST_USER_SET_MEM_TABLE = 5, > > > > VHOST_USER_SET_LOG_BASE = 6, > > > > VHOST_USER_SET_LOG_FD = 7, > > > > @@ -102,7 +102,7 @@ static unsigned long int > > > > ioctl_to_vhost_user_request[VHOST_USER_MAX] = { > > > > VHOST_GET_FEATURES, /* VHOST_USER_GET_FEATURES */ > > > > VHOST_SET_FEATURES, /* VHOST_USER_SET_FEATURES */ > > > > VHOST_SET_OWNER,/* VHOST_USER_SET_OWNER */ > > > > -VHOST_RESET_DEVICE, /* VHOST_USER_RESET_DEVICE */ > > > > +VHOST_RESET_OWNER, /* VHOST_USER_RESET_OWNER */ > > > > VHOST_SET_MEM_TABLE,/* VHOST_USER_SET_MEM_TABLE */ > >
Re: [Qemu-devel] [PATCH v2 5/5] vhost: send VHOST_USER_SET_VRING_ENABLE at start/stop
On Wed, Oct 21, 2015 at 01:39:11PM +0300, Michael S. Tsirkin wrote:
> On Wed, Oct 21, 2015 at 05:07:18PM +0800, Yuanhan Liu wrote:
> > Send VHOST_USER_SET_VRING_ENABLE at start/stop when multiple queue
> > is negotiated, to inform the backend that we are ready or not.
>
> OK but that's only if MQ is set.

Maybe we could just call vhost_backend_set_vring_enable() unconditionally?
It's a nil operation when MQ is not set.

> If not, we need to do
> RESET_OWNER followed by SET_OWNER.

Could you be more specific? Why send RESET_OWNER followed by SET_OWNER?

TBH, I'm a bit confused by RESET_OWNER now: what it does, and when it is
supposed to be sent :( And sending RESET_OWNER inside virtio_net_reset()
also looks weird.

I made a quick try before sending this patchset, and the vhost-user request
dump doesn't look right to me: the message is sent after vhost dev init
(GET_FEATURES, GET_PROTOCOL_FEATURES, SET_OWNER, ..., SET_VRING_CALL), and
before peer attach (SET_VRING_ENABLE) and vhost_dev_start (SET_MEM_TABLE,
... SET_VRING_KICK ...):

    # start of a VM
    VHOST_CONFIG: new virtio connection is 28
    VHOST_CONFIG: new device, handle is 0
    VHOST_CONFIG: read message VHOST_USER_GET_FEATURES
    VHOST_CONFIG: read message VHOST_USER_GET_PROTOCOL_FEATURES
    VHOST_CONFIG: read message VHOST_USER_SET_PROTOCOL_FEATURES
    VHOST_CONFIG: read message VHOST_USER_GET_QUEUE_NUM
    VHOST_CONFIG: read message VHOST_USER_SET_OWNER
    VHOST_CONFIG: read message VHOST_USER_GET_FEATURES
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
    VHOST_CONFIG: vring call idx:0 file:29
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
    VHOST_CONFIG: vring call idx:1 file:30
    VHOST_CONFIG: read message VHOST_USER_GET_FEATURES
    VHOST_CONFIG: read message VHOST_USER_GET_PROTOCOL_FEATURES
    VHOST_CONFIG: read message VHOST_USER_SET_PROTOCOL_FEATURES
    VHOST_CONFIG: read message VHOST_USER_GET_QUEUE_NUM
    VHOST_CONFIG: read message VHOST_USER_GET_FEATURES
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
    ... ...
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
    VHOST_CONFIG: vring call idx:6 file:35
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
    VHOST_CONFIG: vring call idx:7 file:36

==> VHOST_CONFIG: read message VHOST_USER_RESET_OWNER
    VHOST_CONFIG: read message VHOST_USER_RESET_OWNER

    VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
    VHOST_CONFIG: set queue enable: 1 to qp idx: 0
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
    VHOST_CONFIG: set queue enable: 0 to qp idx: 2
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
    VHOST_CONFIG: set queue enable: 0 to qp idx: 4
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
    VHOST_CONFIG: set queue enable: 0 to qp idx: 6
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
    VHOST_CONFIG: vring call idx:0 file:29
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
    VHOST_CONFIG: vring call idx:1 file:30
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
    VHOST_CONFIG: vring call idx:2 file:31
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
    VHOST_CONFIG: vring call idx:3 file:32
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
    VHOST_CONFIG: vring call idx:4 file:33
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
    VHOST_CONFIG: vring call idx:5 file:34
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
    VHOST_CONFIG: vring call idx:6 file:35
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
    VHOST_CONFIG: vring call idx:7 file:36
    VHOST_CONFIG: read message VHOST_USER_SET_FEATURES
    VHOST_CONFIG: read message VHOST_USER_SET_MEM_TABLE
    VHOST_CONFIG: mapped region 0 fd:37 to 0x2aaac000 sz:0xa off:0x0
    VHOST_CONFIG: mapped region 1 fd:38 to 0x2aab sz:0x8000 off:0xc
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_NUM
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_BASE
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_ADDR
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_KICK
    VHOST_CONFIG: vring kick idx:0 file:39
    VHOST_CONFIG: virtio is not ready for processing.
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_NUM
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_BASE
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_ADDR
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_KICK
    VHOST_CONFIG: vring kick idx:1 file:40
    VHOST_CONFIG: virtio is not ready for processing.
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
    VHOST_CONFIG: set queue enable: 1 to qp idx: 0
    VHOST_CONFIG: read message VHOST_USER_SET_FEATURES
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_NUM
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_BASE
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_ADDR
    VHOST_CONFIG: read message VHOST_USER_SET_VRING_KICK
    VHOST
Re: [Qemu-devel] [PATCH v2 1/5] Revert "vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE"
On Wed, Oct 21, 2015 at 01:40:59PM +0300, Michael S. Tsirkin wrote: > On Wed, Oct 21, 2015 at 05:07:14PM +0800, Yuanhan Liu wrote: > > This reverts commit d1f8b30ec8dde0318fd1b98d24a64926feae9625. > > > > It turned out that it breaks stuff, so revert it: > > > > http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg00949.html > > > > CC: "Michael S. Tsirkin" > > Reported-by: Paolo Bonzini > > Signed-off-by: Yuanhan Liu > > > Are these patches dependent on each other? If yes pls send a cover letter. > If not pls don't send a series, post them separately. Got it. > And, pls Cc me on all vhost user patches. Isn't I always doing that? :) --yliu > > > --- > > docs/specs/vhost-user.txt | 4 ++-- > > hw/net/vhost_net.c | 2 +- > > hw/virtio/vhost-user.c | 8 > > linux-headers/linux/vhost.h | 2 +- > > tests/vhost-user-test.c | 2 +- > > 5 files changed, 9 insertions(+), 9 deletions(-) > > > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt > > index 4eadad1..4bcd17d 100644 > > --- a/docs/specs/vhost-user.txt > > +++ b/docs/specs/vhost-user.txt > > @@ -211,10 +211,10 @@ Message types > >as an owner of the session. This can be used on the Slave as a > >"session start" flag. > > > > - * VHOST_USER_RESET_DEVICE > > + * VHOST_USER_RESET_OWNER > > > >Id: 4 > > - Equivalent ioctl: VHOST_RESET_DEVICE > > + Equivalent ioctl: VHOST_RESET_OWNER > >Master payload: N/A > > > >Issued when a new connection is about to be closed. 
The Master will > > no > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c > > index 2bce891..804f5c9 100644 > > --- a/hw/net/vhost_net.c > > +++ b/hw/net/vhost_net.c > > @@ -296,7 +296,7 @@ static void vhost_net_stop_one(struct vhost_net *net, > > } else if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) { > > for (file.index = 0; file.index < net->dev.nvqs; ++file.index) { > > const VhostOps *vhost_ops = net->dev.vhost_ops; > > -int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_DEVICE, > > +int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_OWNER, > >NULL); > > assert(r >= 0); > > } > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c > > index b11c0d2..12a9104 100644 > > --- a/hw/virtio/vhost-user.c > > +++ b/hw/virtio/vhost-user.c > > @@ -34,7 +34,7 @@ typedef enum VhostUserRequest { > > VHOST_USER_GET_FEATURES = 1, > > VHOST_USER_SET_FEATURES = 2, > > VHOST_USER_SET_OWNER = 3, > > -VHOST_USER_RESET_DEVICE = 4, > > +VHOST_USER_RESET_OWNER = 4, > > VHOST_USER_SET_MEM_TABLE = 5, > > VHOST_USER_SET_LOG_BASE = 6, > > VHOST_USER_SET_LOG_FD = 7, > > @@ -102,7 +102,7 @@ static unsigned long int > > ioctl_to_vhost_user_request[VHOST_USER_MAX] = { > > VHOST_GET_FEATURES, /* VHOST_USER_GET_FEATURES */ > > VHOST_SET_FEATURES, /* VHOST_USER_SET_FEATURES */ > > VHOST_SET_OWNER,/* VHOST_USER_SET_OWNER */ > > -VHOST_RESET_DEVICE, /* VHOST_USER_RESET_DEVICE */ > > +VHOST_RESET_OWNER, /* VHOST_USER_RESET_OWNER */ > > VHOST_SET_MEM_TABLE,/* VHOST_USER_SET_MEM_TABLE */ > > VHOST_SET_LOG_BASE, /* VHOST_USER_SET_LOG_BASE */ > > VHOST_SET_LOG_FD, /* VHOST_USER_SET_LOG_FD */ > > @@ -192,7 +192,7 @@ static bool > > vhost_user_one_time_request(VhostUserRequest request) > > { > > switch (request) { > > case VHOST_USER_SET_OWNER: > > -case VHOST_USER_RESET_DEVICE: > > +case VHOST_USER_RESET_OWNER: > > case VHOST_USER_SET_MEM_TABLE: > > case VHOST_USER_GET_QUEUE_NUM: > > return true; > > @@ -249,7 +249,7 @@ static int vhost_user_call(struct vhost_dev 
*dev, > > unsigned long int request, > > break; > > > > case VHOST_USER_SET_OWNER: > > -case VHOST_USER_RESET_DEVICE: > > +case VHOST_USER_RESET_OWNER: > > break; > > > > case VHOST_USER_SET_MEM_TABLE: > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h > > index 14a0160..ead86db 100644 > > --- a/linux-headers/linux/vhost.h > > +++ b/linux-headers/linux/vhost.h > > @@ -78,7 +78,7 @@ struct vhost_memory { > > #define
Re: [Qemu-devel] [PATCH 1/3] Revert "vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE"
On Wed, Oct 21, 2015 at 12:15:52PM +0300, Michael S. Tsirkin wrote: > On Wed, Oct 21, 2015 at 04:55:16PM +0800, Yuanhan Liu wrote: > > On Fri, Oct 16, 2015 at 10:47:49AM +0300, Michael S. Tsirkin wrote: > > > On Fri, Oct 16, 2015 at 03:32:56PM +0800, Yuanhan Liu wrote: > > > > On Fri, Oct 16, 2015 at 10:21:40AM +0300, Michael S. Tsirkin wrote: > > > > > On Fri, Oct 16, 2015 at 03:18:59PM +0800, Yuanhan Liu wrote: > > > > > > This reverts commit d1f8b30ec8dde0318fd1b98d24a64926feae9625. > > > > > > > > > > > > It turned out that it breaks stuff, so revert it: > > > > > > > > > > > > > > > > > > http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg00949.html > > > > > > > > > > > > CC: "Michael S. Tsirkin" > > > > > > Reported-by: Paolo Bonzini > > > > > > Signed-off-by: Yuanhan Liu > > > > > > > > > > OK but that's not enough. We need to also > > > > > - drop patch sending this on stop > > > > > > > > I could do that. But may I know why we have to drop it on stop? > > > > > > Because we need to get the state from remote after stop. > > > RESET_OWNER discards that, so we can't resume the VM. > > > > > > > If we don't send it on stop, when will we send it? The termination > > > > of QEMU? > > > > > > If mq is negotiated, we don't need it. If mq isn't negotiated, maybe we > > > should send it on reset/driver unload. > > > > Sorry, what place do you mean exactly? > > There's no such code - you'll have to add > a callback into vhost from virtio. Say, at virtio_net_reset()? --yliu > > > > > It might be better than > > > corrupting guest memory. And if we do, maybe we should keep > > > the RESET_DEVICE name. > > > > > > > > - send VHOST_USER_SET_VRING_ENABLE on stop and start instead > > > > > > > > That indeed looks more reasonable. And actually, Huawei asked whether > > > > there is a message from QEMU to tell us that a vring is ready in > > > > our weekly meeting this morning. 
And here it is :) > > > > > > > > --yliu > > > > > > Unfortunately that's only there if MQ is negotiated. > > > > > > If MQ is not negotiated, then I think you can get away with > > > starting processing after you get the 1st kick. > > > > > > > > > > > > > > > > > > > > --- > > > > > > docs/specs/vhost-user.txt | 4 ++-- > > > > > > hw/net/vhost_net.c | 2 +- > > > > > > hw/virtio/vhost-user.c | 8 > > > > > > linux-headers/linux/vhost.h | 2 +- > > > > > > tests/vhost-user-test.c | 2 +- > > > > > > 5 files changed, 9 insertions(+), 9 deletions(-) > > > > > > > > > > > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt > > > > > > index 4eadad1..4bcd17d 100644 > > > > > > --- a/docs/specs/vhost-user.txt > > > > > > +++ b/docs/specs/vhost-user.txt > > > > > > @@ -211,10 +211,10 @@ Message types > > > > > >as an owner of the session. This can be used on the Slave as > > > > > > a > > > > > >"session start" flag. > > > > > > > > > > > > - * VHOST_USER_RESET_DEVICE > > > > > > + * VHOST_USER_RESET_OWNER > > > > > > > > > > > >Id: 4 > > > > > > - Equivalent ioctl: VHOST_RESET_DEVICE > > > > > > + Equivalent ioctl: VHOST_RESET_OWNER > > > > > >Master payload: N/A > > > > > > > > > > > >Issued when a new connection is about to be closed. The > > > > > > Master will no > > > > > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c > > > > > > index 2bce891..804f5c9 100644 > > > > > > --- a/hw/net/vhost_net.c > > > > > > +++ b/hw/net/vhost_net.c > > > > > > @@ -296,7 +296,7 @@ static void vhost_net_stop_one(struct vhost_net > > > > > > *net, > > > > > > } else if (net->nc-&g
Re: [Qemu-devel] [PATCH 1/3] Revert "vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE"
On Fri, Oct 16, 2015 at 10:47:49AM +0300, Michael S. Tsirkin wrote: > On Fri, Oct 16, 2015 at 03:32:56PM +0800, Yuanhan Liu wrote: > > On Fri, Oct 16, 2015 at 10:21:40AM +0300, Michael S. Tsirkin wrote: > > > On Fri, Oct 16, 2015 at 03:18:59PM +0800, Yuanhan Liu wrote: > > > > This reverts commit d1f8b30ec8dde0318fd1b98d24a64926feae9625. > > > > > > > > It turned out that it breaks stuff, so revert it: > > > > > > > > > > > > http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg00949.html > > > > > > > > CC: "Michael S. Tsirkin" > > > > Reported-by: Paolo Bonzini > > > > Signed-off-by: Yuanhan Liu > > > > > > OK but that's not enough. We need to also > > > - drop patch sending this on stop > > > > I could do that. But may I know why we have to drop it on stop? > > Because we need to get the state from remote after stop. > RESET_OWNER discards that, so we can't resume the VM. > > > If we don't send it on stop, when will we send it? The termination > > of QEMU? > > If mq is negotiated, we don't need it. If mq isn't negotiated, maybe we > should send it on reset/driver unload. Sorry, what place do you mean exactly? --yliu > It might be better than > corrupting guest memory. And if we do, maybe we should keep > the RESET_DEVICE name. > > > > - send VHOST_USER_SET_VRING_ENABLE on stop and start instead > > > > That indeed looks more reasonable. And actually, Huawei asked whether > > there is a message from QEMU to tell us that a vring is ready in > > our weekly meeting this morning. And here it is :) > > > > --yliu > > Unfortunately that's only there if MQ is negotiated. > > If MQ is not negotiated, then I think you can get away with > starting processing after you get the 1st kick. 
> > > > > > > > > > --- > > > > docs/specs/vhost-user.txt | 4 ++-- > > > > hw/net/vhost_net.c | 2 +- > > > > hw/virtio/vhost-user.c | 8 > > > > linux-headers/linux/vhost.h | 2 +- > > > > tests/vhost-user-test.c | 2 +- > > > > 5 files changed, 9 insertions(+), 9 deletions(-) > > > > > > > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt > > > > index 4eadad1..4bcd17d 100644 > > > > --- a/docs/specs/vhost-user.txt > > > > +++ b/docs/specs/vhost-user.txt > > > > @@ -211,10 +211,10 @@ Message types > > > >as an owner of the session. This can be used on the Slave as a > > > >"session start" flag. > > > > > > > > - * VHOST_USER_RESET_DEVICE > > > > + * VHOST_USER_RESET_OWNER > > > > > > > >Id: 4 > > > > - Equivalent ioctl: VHOST_RESET_DEVICE > > > > + Equivalent ioctl: VHOST_RESET_OWNER > > > >Master payload: N/A > > > > > > > >Issued when a new connection is about to be closed. The Master > > > > will no > > > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c > > > > index 2bce891..804f5c9 100644 > > > > --- a/hw/net/vhost_net.c > > > > +++ b/hw/net/vhost_net.c > > > > @@ -296,7 +296,7 @@ static void vhost_net_stop_one(struct vhost_net > > > > *net, > > > > } else if (net->nc->info->type == > > > > NET_CLIENT_OPTIONS_KIND_VHOST_USER) { > > > > for (file.index = 0; file.index < net->dev.nvqs; ++file.index) > > > > { > > > > const VhostOps *vhost_ops = net->dev.vhost_ops; > > > > -int r = vhost_ops->vhost_call(&net->dev, > > > > VHOST_RESET_DEVICE, > > > > +int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_OWNER, > > > >NULL); > > > > assert(r >= 0); > > > > } > > > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c > > > > index b11c0d2..12a9104 100644 > > > > --- a/hw/virtio/vhost-user.c > > > > +++ b/hw/virtio/vhost-user.c > > > > @@ -34,7 +34,7 @@ typedef enum VhostUserRequest { > > > > VHOST_USER_GET_FEATURES = 1, > > > > VHOST_USER_SET_FEATURES = 2, > > > > VHOST_USER_SET_OWNER = 3, > > > > -VHOST_USER_RESET_DEVICE =
[Qemu-devel] [PATCH v2 5/5] vhost: send VHOST_USER_SET_VRING_ENABLE at start/stop
Send VHOST_USER_SET_VRING_ENABLE at start/stop when multiple queue is
negotiated, to inform the backend that we are ready or not.

And exclude VHOST_USER_GET_QUEUE_NUM from the one-time requests, as we
need to get max_queues for each vhost_dev.

Suggested-by: Michael S. Tsirkin
Signed-off-by: Yuanhan Liu
---
 hw/virtio/vhost-user.c |  1 -
 hw/virtio/vhost.c      | 18 ++++++++++++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 12a9104..6532a73 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -194,7 +194,6 @@ static bool vhost_user_one_time_request(VhostUserRequest request)
     case VHOST_USER_SET_OWNER:
     case VHOST_USER_RESET_OWNER:
     case VHOST_USER_SET_MEM_TABLE:
-    case VHOST_USER_GET_QUEUE_NUM:
         return true;
     default:
         return false;
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index c0ed5b2..54a4633 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1146,6 +1146,19 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
         }
     }
 
+    /*
+     * Send VHOST_USER_SET_VRING_ENABLE message when multiple queue
+     * is negotiated to inform the back end that we are ready.
+     *
+     * Only set enable to 1 for first queue pair, as we enable one
+     * queue pair by default.
+     */
+    if (hdev->max_queues > 1 &&
+        hdev->vhost_ops->vhost_backend_set_vring_enable) {
+        hdev->vhost_ops->vhost_backend_set_vring_enable(hdev,
+                                                        hdev->vq_index == 0);
+    }
+
     return 0;
 
 fail_log:
     vhost_log_put(hdev, false);
@@ -1180,5 +1193,10 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
     hdev->started = false;
     hdev->log = NULL;
     hdev->log_size = 0;
+
+    if (hdev->max_queues > 1 &&
+        hdev->vhost_ops->vhost_backend_set_vring_enable) {
+        hdev->vhost_ops->vhost_backend_set_vring_enable(hdev, 0);
+    }
 }
-- 
1.9.0
[Qemu-devel] [PATCH v2 4/5] Revert "vhost-user: Send VHOST_RESET_OWNER on vhost stop"
Don't send VHOST_RESET_OWNER, for as Michael stated:

    Because we need to get the state from remote after stop.
    RESET_OWNER discards that, so we can't resume the VM.

This reverts commit 294ce717e0f212ed0763307f3eab72b4a1bdf4d0.

Cc: Luke Gorrie
Cc: Michael S. Tsirkin
Signed-off-by: Yuanhan Liu
---
 hw/net/vhost_net.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 804f5c9..95da5f8 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -293,13 +293,6 @@ static void vhost_net_stop_one(struct vhost_net *net,
                               &file);
             assert(r >= 0);
         }
-    } else if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) {
-        for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
-            const VhostOps *vhost_ops = net->dev.vhost_ops;
-            int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_OWNER,
-                                          NULL);
-            assert(r >= 0);
-        }
     }
     if (net->nc->info->poll) {
         net->nc->info->poll(net->nc, true);
-- 
1.9.0
[Qemu-devel] [PATCH v2 1/5] Revert "vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE"
This reverts commit d1f8b30ec8dde0318fd1b98d24a64926feae9625.

It turned out that it breaks stuff, so revert it:

    http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg00949.html

CC: "Michael S. Tsirkin"
Reported-by: Paolo Bonzini
Signed-off-by: Yuanhan Liu
---
 docs/specs/vhost-user.txt   | 4 ++--
 hw/net/vhost_net.c          | 2 +-
 hw/virtio/vhost-user.c      | 8 ++++----
 linux-headers/linux/vhost.h | 2 +-
 tests/vhost-user-test.c     | 2 +-
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 4eadad1..4bcd17d 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -211,10 +211,10 @@ Message types
       as an owner of the session. This can be used on the Slave as a
       "session start" flag.
 
- * VHOST_USER_RESET_DEVICE
+ * VHOST_USER_RESET_OWNER
 
       Id: 4
-      Equivalent ioctl: VHOST_RESET_DEVICE
+      Equivalent ioctl: VHOST_RESET_OWNER
       Master payload: N/A
 
       Issued when a new connection is about to be closed. The Master will no
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 2bce891..804f5c9 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -296,7 +296,7 @@ static void vhost_net_stop_one(struct vhost_net *net,
     } else if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) {
         for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
             const VhostOps *vhost_ops = net->dev.vhost_ops;
-            int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_DEVICE,
+            int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_OWNER,
                                           NULL);
             assert(r >= 0);
         }
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index b11c0d2..12a9104 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -34,7 +34,7 @@ typedef enum VhostUserRequest {
     VHOST_USER_GET_FEATURES = 1,
     VHOST_USER_SET_FEATURES = 2,
     VHOST_USER_SET_OWNER = 3,
-    VHOST_USER_RESET_DEVICE = 4,
+    VHOST_USER_RESET_OWNER = 4,
     VHOST_USER_SET_MEM_TABLE = 5,
     VHOST_USER_SET_LOG_BASE = 6,
     VHOST_USER_SET_LOG_FD = 7,
@@ -102,7 +102,7 @@ static unsigned long int ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
     VHOST_GET_FEATURES, /* VHOST_USER_GET_FEATURES */
     VHOST_SET_FEATURES, /* VHOST_USER_SET_FEATURES */
     VHOST_SET_OWNER,    /* VHOST_USER_SET_OWNER */
-    VHOST_RESET_DEVICE, /* VHOST_USER_RESET_DEVICE */
+    VHOST_RESET_OWNER,  /* VHOST_USER_RESET_OWNER */
     VHOST_SET_MEM_TABLE,/* VHOST_USER_SET_MEM_TABLE */
     VHOST_SET_LOG_BASE, /* VHOST_USER_SET_LOG_BASE */
     VHOST_SET_LOG_FD,   /* VHOST_USER_SET_LOG_FD */
@@ -192,7 +192,7 @@ static bool vhost_user_one_time_request(VhostUserRequest request)
 {
     switch (request) {
     case VHOST_USER_SET_OWNER:
-    case VHOST_USER_RESET_DEVICE:
+    case VHOST_USER_RESET_OWNER:
     case VHOST_USER_SET_MEM_TABLE:
     case VHOST_USER_GET_QUEUE_NUM:
         return true;
@@ -249,7 +249,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
         break;
 
     case VHOST_USER_SET_OWNER:
-    case VHOST_USER_RESET_DEVICE:
+    case VHOST_USER_RESET_OWNER:
         break;
 
     case VHOST_USER_SET_MEM_TABLE:
diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
index 14a0160..ead86db 100644
--- a/linux-headers/linux/vhost.h
+++ b/linux-headers/linux/vhost.h
@@ -78,7 +78,7 @@ struct vhost_memory {
 #define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01)
 /* Give up ownership, and reset the device to default values.
  * Allows subsequent call to VHOST_OWNER_SET to succeed. */
-#define VHOST_RESET_DEVICE _IO(VHOST_VIRTIO, 0x02)
+#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
 /* Set up/modify memory layout */
 #define VHOST_SET_MEM_TABLE    _IOW(VHOST_VIRTIO, 0x03, struct vhost_memory)
diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
index 56df5cc..f181391 100644
--- a/tests/vhost-user-test.c
+++ b/tests/vhost-user-test.c
@@ -53,7 +53,7 @@ typedef enum VhostUserRequest {
     VHOST_USER_GET_FEATURES = 1,
     VHOST_USER_SET_FEATURES = 2,
     VHOST_USER_SET_OWNER = 3,
-    VHOST_USER_RESET_DEVICE = 4,
+    VHOST_USER_RESET_OWNER = 4,
     VHOST_USER_SET_MEM_TABLE = 5,
     VHOST_USER_SET_LOG_BASE = 6,
     VHOST_USER_SET_LOG_FD = 7,
-- 
1.9.0
[Qemu-devel] [PATCH v2 2/5] doc: vhost-user: request naming fix
They are VHOST_USER_XXX instead of VHOST_XXX messages.

Also, add VHOST_USER_GET_QUEUE_NUM to the section that requires replies.

Cc: Michael S. Tsirkin
Signed-off-by: Yuanhan Liu
---
 docs/specs/vhost-user.txt | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 4bcd17d..b23d88f 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -112,18 +112,19 @@ The communication consists of master sending message requests and slave sending
 message replies. Most of the requests don't require replies. Here is a list of
 the ones that do:
 
- * VHOST_GET_FEATURES
- * VHOST_GET_PROTOCOL_FEATURES
- * VHOST_GET_VRING_BASE
+ * VHOST_USER_GET_FEATURES
+ * VHOST_USER_GET_PROTOCOL_FEATURES
+ * VHOST_USER_GET_VRING_BASE
+ * VHOST_USER_GET_QUEUE_NUM
 
 There are several messages that the master sends with file descriptors passed
 in the ancillary data:
 
- * VHOST_SET_MEM_TABLE
- * VHOST_SET_LOG_FD
- * VHOST_SET_VRING_KICK
- * VHOST_SET_VRING_CALL
- * VHOST_SET_VRING_ERR
+ * VHOST_USER_SET_MEM_TABLE
+ * VHOST_USER_SET_LOG_FD
+ * VHOST_USER_SET_VRING_KICK
+ * VHOST_USER_SET_VRING_CALL
+ * VHOST_USER_SET_VRING_ERR
 
 If Master is unable to send the full message or receives a wrong reply it will
 close the connection. An optional reconnection mechanism can be implemented.
-- 
1.9.0
[Qemu-devel] [PATCH v2 3/5] vhost-user-test: add multiple queue test
Setting VHOST_USER_PROTOCOL_F_MQ protocol feature bit to claim that we
support MQ feature, and simply assume we support 0xff queue pairs at most.

Cc: Michael S. Tsirkin
Cc: Jason Wang
Signed-off-by: Yuanhan Liu
---
v2: use macro to define the max queues we support
---
 tests/vhost-user-test.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
index f181391..8162e64 100644
--- a/tests/vhost-user-test.c
+++ b/tests/vhost-user-test.c
@@ -29,11 +29,14 @@
 #define HAVE_MONOTONIC_TIME
 #endif
 
+#define MAX_QUEUS 0xff
+
 #define QEMU_CMD_ACCEL  " -machine accel=tcg"
 #define QEMU_CMD_MEM    " -m 512 -object memory-backend-file,id=mem,size=512M,"\
                         "mem-path=%s,share=on -numa node,memdev=mem"
 #define QEMU_CMD_CHR    " -chardev socket,id=chr0,path=%s"
-#define QEMU_CMD_NETDEV " -netdev vhost-user,id=net0,chardev=chr0,vhostforce"
+#define QEMU_CMD_NETDEV " -netdev vhost-user,id=net0,chardev=chr0,vhostforce," \
+                        "queues=2"
 #define QEMU_CMD_NET    " -device virtio-net-pci,netdev=net0 "
 #define QEMU_CMD_ROM    " -option-rom ../pc-bios/pxe-virtio.rom"
 
@@ -48,6 +51,8 @@
 
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
 
+#define VHOST_USER_PROTOCOL_F_MQ 0
+
 typedef enum VhostUserRequest {
     VHOST_USER_NONE = 0,
     VHOST_USER_GET_FEATURES = 1,
@@ -66,6 +71,8 @@ typedef enum VhostUserRequest {
     VHOST_USER_SET_VRING_ERR = 14,
     VHOST_USER_GET_PROTOCOL_FEATURES = 15,
     VHOST_USER_SET_PROTOCOL_FEATURES = 16,
+    VHOST_USER_GET_QUEUE_NUM = 17,
+    VHOST_USER_SET_VRING_ENABLE = 18,
     VHOST_USER_MAX
 } VhostUserRequest;
 
@@ -232,7 +239,7 @@ static void chr_read(void *opaque, const uint8_t *buf, int size)
         /* send back features to qemu */
         msg.flags |= VHOST_USER_REPLY_MASK;
         msg.size = sizeof(m.u64);
-        msg.u64 = 0;
+        msg.u64 = (1ULL << VHOST_USER_PROTOCOL_F_MQ);
         p = (uint8_t *) &msg;
         qemu_chr_fe_write_all(chr, p, VHOST_USER_HDR_SIZE + msg.size);
         break;
@@ -266,6 +273,16 @@ static void chr_read(void *opaque, const uint8_t *buf, int size)
          */
         qemu_set_nonblock(fd);
         break;
 
+    case VHOST_USER_GET_QUEUE_NUM:
+        /* send back the number of queues we support (let it be 2) to qemu */
+        msg.flags |= VHOST_USER_REPLY_MASK;
+        msg.size = sizeof(m.u64);
+        msg.u64 = MAX_QUEUS;
+        p = (uint8_t *) &msg;
+        qemu_chr_fe_write_all(chr, p, VHOST_USER_HDR_SIZE + msg.size);
+        break;
+
     default:
         break;
     }
-- 
1.9.0
Re: [Qemu-devel] [PATCH 1/3] Revert "vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE"
On Fri, Oct 16, 2015 at 11:15:23AM +0300, Michael S. Tsirkin wrote: > On Fri, Oct 16, 2015 at 04:04:09PM +0800, Yuanhan Liu wrote: > > On Fri, Oct 16, 2015 at 10:47:49AM +0300, Michael S. Tsirkin wrote: > > > On Fri, Oct 16, 2015 at 03:32:56PM +0800, Yuanhan Liu wrote: > > > > On Fri, Oct 16, 2015 at 10:21:40AM +0300, Michael S. Tsirkin wrote: > > > > > On Fri, Oct 16, 2015 at 03:18:59PM +0800, Yuanhan Liu wrote: > > > > > > This reverts commit d1f8b30ec8dde0318fd1b98d24a64926feae9625. > > > > > > > > > > > > It turned out that it breaks stuff, so revert it: > > > > > > > > > > > > > > > > > > http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg00949.html > > > > > > > > > > > > CC: "Michael S. Tsirkin" > > > > > > Reported-by: Paolo Bonzini > > > > > > Signed-off-by: Yuanhan Liu > > > > > > > > > > OK but that's not enough. We need to also > > > > > - drop patch sending this on stop > > > > > > > > I could do that. But may I know why we have to drop it on stop? > > > > > > Because we need to get the state from remote after stop. > > > RESET_OWNER discards that, so we can't resume the VM. > > > > Got it, and thanks for the explanation. > > > > > > If we don't send it on stop, when will we send it? The termination > > > > of QEMU? > > > > > > If mq is negotiated, we don't need it. If mq isn't negotiated, maybe we > > > should send it on reset/driver unload. It might be better than > > > corrupting guest memory. And if we do, maybe we should keep > > > the RESET_DEVICE name. > > > > Maybe we should just ignore it now, before I (we) have clear clue > > on how to handle it correctly. > > Well if MQ is negotiated, then skipping this is clearly > the right thing to do. Let me put a bit more thoughts on this when get time. > > > > > > > > - send VHOST_USER_SET_VRING_ENABLE on stop and start instead > > > > > > > > That indeed looks more reasonable. 
And actually, Huawei asked whether > > > > there is a message from QEMU to tell us that a vring is ready in > > > > our weekly meeting this morning. And here it is :) > > > > > > > > --yliu > > > > > > Unfortunately that's only there if MQ is negotiated. > > > > Yes. > > > > > > > > If MQ is not negotiated, then I think you can get away with > > > starting processing after you get the 1st kick. > > > > We mark a vring is ready by checking the callfd and kickfd: > > if both are valid (> 0), then it's ready. > > > > --yliu > > No, that's not right. If MQ is not there, you must not look at vring > until you get at least 1 kick. In other words, until kickfd is readable. > Whether callfd is set should not matter at all. That makes sense to me. Huawei, what do you think? --yliu > If MQ is there, you should just go by VHOST_USER_SET_VRING_ENABLE. > > > > > > > > > > > > > > > > > > > --- > > > > > > docs/specs/vhost-user.txt | 4 ++-- > > > > > > hw/net/vhost_net.c | 2 +- > > > > > > hw/virtio/vhost-user.c | 8 > > > > > > linux-headers/linux/vhost.h | 2 +- > > > > > > tests/vhost-user-test.c | 2 +- > > > > > > 5 files changed, 9 insertions(+), 9 deletions(-) > > > > > > > > > > > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt > > > > > > index 4eadad1..4bcd17d 100644 > > > > > > --- a/docs/specs/vhost-user.txt > > > > > > +++ b/docs/specs/vhost-user.txt > > > > > > @@ -211,10 +211,10 @@ Message types > > > > > >as an owner of the session. This can be used on the Slave as > > > > > > a > > > > > >"session start" flag. > > > > > > > > > > > > - * VHOST_USER_RESET_DEVICE > > > > > > + * VHOST_USER_RESET_OWNER > > > > > > > > > > > >Id: 4 > > > > > > - Equivalent ioctl: VHOST_RESET_DEVICE > > > > > > + Equivalent ioctl: VHOST_RESET_OWNER > > > > > >
Re: [Qemu-devel] [PATCH 1/3] Revert "vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE"
On Fri, Oct 16, 2015 at 10:47:49AM +0300, Michael S. Tsirkin wrote: > On Fri, Oct 16, 2015 at 03:32:56PM +0800, Yuanhan Liu wrote: > > On Fri, Oct 16, 2015 at 10:21:40AM +0300, Michael S. Tsirkin wrote: > > > On Fri, Oct 16, 2015 at 03:18:59PM +0800, Yuanhan Liu wrote: > > > > This reverts commit d1f8b30ec8dde0318fd1b98d24a64926feae9625. > > > > > > > > It turned out that it breaks stuff, so revert it: > > > > > > > > > > > > http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg00949.html > > > > > > > > CC: "Michael S. Tsirkin" > > > > Reported-by: Paolo Bonzini > > > > Signed-off-by: Yuanhan Liu > > > > > > OK but that's not enough. We need to also > > > - drop patch sending this on stop > > > > I could do that. But may I know why we have to drop it on stop? > > Because we need to get the state from remote after stop. > RESET_OWNER discards that, so we can't resume the VM. Got it, and thanks for the explanation. > > If we don't send it on stop, when will we send it? The termination > > of QEMU? > > If mq is negotiated, we don't need it. If mq isn't negotiated, maybe we > should send it on reset/driver unload. It might be better than > corrupting guest memory. And if we do, maybe we should keep > the RESET_DEVICE name. Maybe we should just ignore it now, before I (we) have clear clue on how to handle it correctly. > > > > - send VHOST_USER_SET_VRING_ENABLE on stop and start instead > > > > That indeed looks more reasonable. And actually, Huawei asked whether > > there is a message from QEMU to tell us that a vring is ready in > > our weekly meeting this morning. And here it is :) > > > > --yliu > > Unfortunately that's only there if MQ is negotiated. Yes. > > If MQ is not negotiated, then I think you can get away with > starting processing after you get the 1st kick. We mark a vring is ready by checking the callfd and kickfd: if both are valid (> 0), then it's ready. 
--yliu > > > > > > > > > --- > > > > docs/specs/vhost-user.txt | 4 ++-- > > > > hw/net/vhost_net.c | 2 +- > > > > hw/virtio/vhost-user.c | 8 > > > > linux-headers/linux/vhost.h | 2 +- > > > > tests/vhost-user-test.c | 2 +- > > > > 5 files changed, 9 insertions(+), 9 deletions(-) > > > > > > > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt > > > > index 4eadad1..4bcd17d 100644 > > > > --- a/docs/specs/vhost-user.txt > > > > +++ b/docs/specs/vhost-user.txt > > > > @@ -211,10 +211,10 @@ Message types > > > >as an owner of the session. This can be used on the Slave as a > > > >"session start" flag. > > > > > > > > - * VHOST_USER_RESET_DEVICE > > > > + * VHOST_USER_RESET_OWNER > > > > > > > >Id: 4 > > > > - Equivalent ioctl: VHOST_RESET_DEVICE > > > > + Equivalent ioctl: VHOST_RESET_OWNER > > > >Master payload: N/A > > > > > > > >Issued when a new connection is about to be closed. The Master > > > > will no > > > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c > > > > index 2bce891..804f5c9 100644 > > > > --- a/hw/net/vhost_net.c > > > > +++ b/hw/net/vhost_net.c > > > > @@ -296,7 +296,7 @@ static void vhost_net_stop_one(struct vhost_net > > > > *net, > > > > } else if (net->nc->info->type == > > > > NET_CLIENT_OPTIONS_KIND_VHOST_USER) { > > > > for (file.index = 0; file.index < net->dev.nvqs; ++file.index) > > > > { > > > > const VhostOps *vhost_ops = net->dev.vhost_ops; > > > > -int r = vhost_ops->vhost_call(&net->dev, > > > > VHOST_RESET_DEVICE, > > > > +int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_OWNER, > > > >NULL); > > > > assert(r >= 0); > > > > } > > > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c > > > > index b11c0d2..12a9104 100644 > > > > --- a/hw/virtio/vhost-user.c > > > > +++ b/hw/virtio/vhost-user.c > > > > @@ -34,7 +34,7 @@ typedef enum VhostUserRequest { >
Re: [Qemu-devel] [PATCH 3/3] vhost-user-test: add multiple queue test
On Fri, Oct 16, 2015 at 10:20:08AM +0300, Michael S. Tsirkin wrote: > On Fri, Oct 16, 2015 at 03:19:01PM +0800, Yuanhan Liu wrote: > > Setting VHOST_USER_PROTOCOL_F_MQ protocol feature bit to claim that we > > support MQ feature, and simply assume we support 2 queue pairs at most. > > Well - let's add a macro, or something? Yeah, a macro is better. --yliu > > > > > Cc: Michael S. Tsirkin > > Cc: Jason Wang > > Signed-off-by: Yuanhan Liu > > --- > > tests/vhost-user-test.c | 18 -- > > 1 file changed, 16 insertions(+), 2 deletions(-) > > > > diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c > > index f181391..f0aa36f 100644 > > --- a/tests/vhost-user-test.c > > +++ b/tests/vhost-user-test.c > > @@ -33,7 +33,7 @@ > > #define QEMU_CMD_MEM" -m 512 -object > > memory-backend-file,id=mem,size=512M,"\ > > "mem-path=%s,share=on -numa node,memdev=mem" > > #define QEMU_CMD_CHR" -chardev socket,id=chr0,path=%s" > > -#define QEMU_CMD_NETDEV " -netdev > > vhost-user,id=net0,chardev=chr0,vhostforce" > > +#define QEMU_CMD_NETDEV " -netdev > > vhost-user,id=net0,chardev=chr0,vhostforce,queues=2" > > #define QEMU_CMD_NET" -device virtio-net-pci,netdev=net0 " > > #define QEMU_CMD_ROM" -option-rom ../pc-bios/pxe-virtio.rom" > > > > @@ -48,6 +48,8 @@ > > > > #define VHOST_USER_F_PROTOCOL_FEATURES 30 > > > > +#define VHOST_USER_PROTOCOL_F_MQ 0 > > + > > typedef enum VhostUserRequest { > > VHOST_USER_NONE = 0, > > VHOST_USER_GET_FEATURES = 1, > > @@ -66,6 +68,8 @@ typedef enum VhostUserRequest { > > VHOST_USER_SET_VRING_ERR = 14, > > VHOST_USER_GET_PROTOCOL_FEATURES = 15, > > VHOST_USER_SET_PROTOCOL_FEATURES = 16, > > +VHOST_USER_GET_QUEUE_NUM = 17, > > +VHOST_USER_SET_VRING_ENABLE = 18, > > VHOST_USER_MAX > > } VhostUserRequest; > > > > @@ -232,7 +236,7 @@ static void chr_read(void *opaque, const uint8_t *buf, > > int size) > > /* send back features to qemu */ > > msg.flags |= VHOST_USER_REPLY_MASK; > > msg.size = sizeof(m.u64); > > -msg.u64 = 0; > > +msg.u64 = (1ULL << 
VHOST_USER_PROTOCOL_F_MQ); > > p = (uint8_t *) &msg; > > qemu_chr_fe_write_all(chr, p, VHOST_USER_HDR_SIZE + msg.size); > > break; > > @@ -266,6 +270,16 @@ static void chr_read(void *opaque, const uint8_t *buf, > > int size) > > */ > > qemu_set_nonblock(fd); > > break; > > + > > +case VHOST_USER_GET_QUEUE_NUM: > > + /* send back the number of queues we support (let it be 2) to > > qemu */ > > +msg.flags |= VHOST_USER_REPLY_MASK; > > +msg.size = sizeof(m.u64); > > +msg.u64 = 2; > > +p = (uint8_t *) &msg; > > +qemu_chr_fe_write_all(chr, p, VHOST_USER_HDR_SIZE + msg.size); > > +break; > > + > > default: > > break; > > } > > -- > > 1.9.0
Re: [Qemu-devel] [PATCH 1/3] Revert "vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE"
On Fri, Oct 16, 2015 at 10:21:40AM +0300, Michael S. Tsirkin wrote: > On Fri, Oct 16, 2015 at 03:18:59PM +0800, Yuanhan Liu wrote: > > This reverts commit d1f8b30ec8dde0318fd1b98d24a64926feae9625. > > > > It turned out that it breaks stuff, so revert it: > > > > http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg00949.html > > > > CC: "Michael S. Tsirkin" > > Reported-by: Paolo Bonzini > > Signed-off-by: Yuanhan Liu > > OK but that's not enough. We need to also > - drop patch sending this on stop I could do that. But may I know why we have to drop it on stop? If we don't send it on stop, when will we send it? The termination of QEMU? > - send VHOST_USER_SET_VRING_ENABLE on stop and start instead That indeed looks more reasonable. And actually, Huawei asked whether there is a message from QEMU to tell us that a vring is ready in our weekly meeting this morning. And here it is :) --yliu > > > --- > > docs/specs/vhost-user.txt | 4 ++-- > > hw/net/vhost_net.c | 2 +- > > hw/virtio/vhost-user.c | 8 > > linux-headers/linux/vhost.h | 2 +- > > tests/vhost-user-test.c | 2 +- > > 5 files changed, 9 insertions(+), 9 deletions(-) > > > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt > > index 4eadad1..4bcd17d 100644 > > --- a/docs/specs/vhost-user.txt > > +++ b/docs/specs/vhost-user.txt > > @@ -211,10 +211,10 @@ Message types > >as an owner of the session. This can be used on the Slave as a > >"session start" flag. > > > > - * VHOST_USER_RESET_DEVICE > > + * VHOST_USER_RESET_OWNER > > > >Id: 4 > > - Equivalent ioctl: VHOST_RESET_DEVICE > > + Equivalent ioctl: VHOST_RESET_OWNER > >Master payload: N/A > > > >Issued when a new connection is about to be closed. 
The Master will > > no > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c > > index 2bce891..804f5c9 100644 > > --- a/hw/net/vhost_net.c > > +++ b/hw/net/vhost_net.c > > @@ -296,7 +296,7 @@ static void vhost_net_stop_one(struct vhost_net *net, > > } else if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) { > > for (file.index = 0; file.index < net->dev.nvqs; ++file.index) { > > const VhostOps *vhost_ops = net->dev.vhost_ops; > > -int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_DEVICE, > > +int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_OWNER, > >NULL); > > assert(r >= 0); > > } > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c > > index b11c0d2..12a9104 100644 > > --- a/hw/virtio/vhost-user.c > > +++ b/hw/virtio/vhost-user.c > > @@ -34,7 +34,7 @@ typedef enum VhostUserRequest { > > VHOST_USER_GET_FEATURES = 1, > > VHOST_USER_SET_FEATURES = 2, > > VHOST_USER_SET_OWNER = 3, > > -VHOST_USER_RESET_DEVICE = 4, > > +VHOST_USER_RESET_OWNER = 4, > > VHOST_USER_SET_MEM_TABLE = 5, > > VHOST_USER_SET_LOG_BASE = 6, > > VHOST_USER_SET_LOG_FD = 7, > > @@ -102,7 +102,7 @@ static unsigned long int > > ioctl_to_vhost_user_request[VHOST_USER_MAX] = { > > VHOST_GET_FEATURES, /* VHOST_USER_GET_FEATURES */ > > VHOST_SET_FEATURES, /* VHOST_USER_SET_FEATURES */ > > VHOST_SET_OWNER,/* VHOST_USER_SET_OWNER */ > > -VHOST_RESET_DEVICE, /* VHOST_USER_RESET_DEVICE */ > > +VHOST_RESET_OWNER, /* VHOST_USER_RESET_OWNER */ > > VHOST_SET_MEM_TABLE,/* VHOST_USER_SET_MEM_TABLE */ > > VHOST_SET_LOG_BASE, /* VHOST_USER_SET_LOG_BASE */ > > VHOST_SET_LOG_FD, /* VHOST_USER_SET_LOG_FD */ > > @@ -192,7 +192,7 @@ static bool > > vhost_user_one_time_request(VhostUserRequest request) > > { > > switch (request) { > > case VHOST_USER_SET_OWNER: > > -case VHOST_USER_RESET_DEVICE: > > +case VHOST_USER_RESET_OWNER: > > case VHOST_USER_SET_MEM_TABLE: > > case VHOST_USER_GET_QUEUE_NUM: > > return true; > > @@ -249,7 +249,7 @@ static int vhost_user_call(struct vhost_dev 
*dev, > > unsigned long int request, > > break; > > > > case VHOST_USER_SET_OWNER: > > -case VHOST_USER_RESET_DEVICE: > > +case VHOST_USER_RESET_OWNER: > > break; > > > > case VHOST_USER_SET_MEM_TABLE: > > diff --git a/l
[Qemu-devel] [PATCH 1/3] Revert "vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE"
This reverts commit d1f8b30ec8dde0318fd1b98d24a64926feae9625.

It turned out that it breaks stuff, so revert it:

http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg00949.html

CC: "Michael S. Tsirkin"
Reported-by: Paolo Bonzini
Signed-off-by: Yuanhan Liu
---
 docs/specs/vhost-user.txt   | 4 ++--
 hw/net/vhost_net.c          | 2 +-
 hw/virtio/vhost-user.c      | 8 ++++----
 linux-headers/linux/vhost.h | 2 +-
 tests/vhost-user-test.c     | 2 +-
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 4eadad1..4bcd17d 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -211,10 +211,10 @@ Message types
       as an owner of the session. This can be used on the Slave as a
       "session start" flag.
 
- * VHOST_USER_RESET_DEVICE
+ * VHOST_USER_RESET_OWNER
 
       Id: 4
-      Equivalent ioctl: VHOST_RESET_DEVICE
+      Equivalent ioctl: VHOST_RESET_OWNER
       Master payload: N/A
 
       Issued when a new connection is about to be closed. The Master will no
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 2bce891..804f5c9 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -296,7 +296,7 @@ static void vhost_net_stop_one(struct vhost_net *net,
     } else if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) {
         for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
             const VhostOps *vhost_ops = net->dev.vhost_ops;
-            int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_DEVICE,
+            int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_OWNER,
                                           NULL);
             assert(r >= 0);
         }
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index b11c0d2..12a9104 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -34,7 +34,7 @@ typedef enum VhostUserRequest {
     VHOST_USER_GET_FEATURES = 1,
     VHOST_USER_SET_FEATURES = 2,
     VHOST_USER_SET_OWNER = 3,
-    VHOST_USER_RESET_DEVICE = 4,
+    VHOST_USER_RESET_OWNER = 4,
     VHOST_USER_SET_MEM_TABLE = 5,
     VHOST_USER_SET_LOG_BASE = 6,
     VHOST_USER_SET_LOG_FD = 7,
@@ -102,7 +102,7 @@ static unsigned long int ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
     VHOST_GET_FEATURES,     /* VHOST_USER_GET_FEATURES */
     VHOST_SET_FEATURES,     /* VHOST_USER_SET_FEATURES */
     VHOST_SET_OWNER,        /* VHOST_USER_SET_OWNER */
-    VHOST_RESET_DEVICE,     /* VHOST_USER_RESET_DEVICE */
+    VHOST_RESET_OWNER,      /* VHOST_USER_RESET_OWNER */
     VHOST_SET_MEM_TABLE,    /* VHOST_USER_SET_MEM_TABLE */
     VHOST_SET_LOG_BASE,     /* VHOST_USER_SET_LOG_BASE */
     VHOST_SET_LOG_FD,       /* VHOST_USER_SET_LOG_FD */
@@ -192,7 +192,7 @@ static bool vhost_user_one_time_request(VhostUserRequest request)
 {
     switch (request) {
     case VHOST_USER_SET_OWNER:
-    case VHOST_USER_RESET_DEVICE:
+    case VHOST_USER_RESET_OWNER:
     case VHOST_USER_SET_MEM_TABLE:
     case VHOST_USER_GET_QUEUE_NUM:
         return true;
@@ -249,7 +249,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
         break;
 
     case VHOST_USER_SET_OWNER:
-    case VHOST_USER_RESET_DEVICE:
+    case VHOST_USER_RESET_OWNER:
         break;
 
     case VHOST_USER_SET_MEM_TABLE:
diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
index 14a0160..ead86db 100644
--- a/linux-headers/linux/vhost.h
+++ b/linux-headers/linux/vhost.h
@@ -78,7 +78,7 @@ struct vhost_memory {
 #define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01)
 /* Give up ownership, and reset the device to default values.
  * Allows subsequent call to VHOST_OWNER_SET to succeed. */
-#define VHOST_RESET_DEVICE _IO(VHOST_VIRTIO, 0x02)
+#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
 
 /* Set up/modify memory layout */
 #define VHOST_SET_MEM_TABLE _IOW(VHOST_VIRTIO, 0x03, struct vhost_memory)
diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
index 56df5cc..f181391 100644
--- a/tests/vhost-user-test.c
+++ b/tests/vhost-user-test.c
@@ -53,7 +53,7 @@ typedef enum VhostUserRequest {
    VHOST_USER_GET_FEATURES = 1,
    VHOST_USER_SET_FEATURES = 2,
    VHOST_USER_SET_OWNER = 3,
-   VHOST_USER_RESET_DEVICE = 4,
+   VHOST_USER_RESET_OWNER = 4,
    VHOST_USER_SET_MEM_TABLE = 5,
    VHOST_USER_SET_LOG_BASE = 6,
    VHOST_USER_SET_LOG_FD = 7,
-- 
1.9.0
[Qemu-devel] [PATCH 2/3] doc: vhost-user: request naming fix
They are VHOST_USER_XXX instead of VHOST_XXX messages. Also, add
VHOST_USER_GET_QUEUE_NUM to the section that requires replies.

Cc: Michael S. Tsirkin
Signed-off-by: Yuanhan Liu
---
 docs/specs/vhost-user.txt | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 4bcd17d..b23d88f 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -112,18 +112,19 @@ The communication consists of master sending message requests and slave sending
 message replies. Most of the requests don't require replies. Here is a list of
 the ones that do:
 
- * VHOST_GET_FEATURES
- * VHOST_GET_PROTOCOL_FEATURES
- * VHOST_GET_VRING_BASE
+ * VHOST_USER_GET_FEATURES
+ * VHOST_USER_GET_PROTOCOL_FEATURES
+ * VHOST_USER_GET_VRING_BASE
+ * VHOST_USER_GET_QUEUE_NUM
 
 There are several messages that the master sends with file descriptors passed
 in the ancillary data:
 
- * VHOST_SET_MEM_TABLE
- * VHOST_SET_LOG_FD
- * VHOST_SET_VRING_KICK
- * VHOST_SET_VRING_CALL
- * VHOST_SET_VRING_ERR
+ * VHOST_USER_SET_MEM_TABLE
+ * VHOST_USER_SET_LOG_FD
+ * VHOST_USER_SET_VRING_KICK
+ * VHOST_USER_SET_VRING_CALL
+ * VHOST_USER_SET_VRING_ERR
 
 If Master is unable to send the full message or receives a wrong reply it will
 close the connection. An optional reconnection mechanism can be implemented.
-- 
1.9.0
[Qemu-devel] [PATCH 3/3] vhost-user-test: add multiple queue test
Setting the VHOST_USER_PROTOCOL_F_MQ protocol feature bit to claim that we
support the MQ feature, and simply assume we support 2 queue pairs at most.

Cc: Michael S. Tsirkin
Cc: Jason Wang
Signed-off-by: Yuanhan Liu
---
 tests/vhost-user-test.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
index f181391..f0aa36f 100644
--- a/tests/vhost-user-test.c
+++ b/tests/vhost-user-test.c
@@ -33,7 +33,7 @@
 #define QEMU_CMD_MEM    " -m 512 -object memory-backend-file,id=mem,size=512M,"\
                         "mem-path=%s,share=on -numa node,memdev=mem"
 #define QEMU_CMD_CHR    " -chardev socket,id=chr0,path=%s"
-#define QEMU_CMD_NETDEV " -netdev vhost-user,id=net0,chardev=chr0,vhostforce"
+#define QEMU_CMD_NETDEV " -netdev vhost-user,id=net0,chardev=chr0,vhostforce,queues=2"
 #define QEMU_CMD_NET    " -device virtio-net-pci,netdev=net0 "
 #define QEMU_CMD_ROM    " -option-rom ../pc-bios/pxe-virtio.rom"
 
@@ -48,6 +48,8 @@
 
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
 
+#define VHOST_USER_PROTOCOL_F_MQ 0
+
 typedef enum VhostUserRequest {
     VHOST_USER_NONE = 0,
     VHOST_USER_GET_FEATURES = 1,
@@ -66,6 +68,8 @@ typedef enum VhostUserRequest {
     VHOST_USER_SET_VRING_ERR = 14,
     VHOST_USER_GET_PROTOCOL_FEATURES = 15,
     VHOST_USER_SET_PROTOCOL_FEATURES = 16,
+    VHOST_USER_GET_QUEUE_NUM = 17,
+    VHOST_USER_SET_VRING_ENABLE = 18,
     VHOST_USER_MAX
 } VhostUserRequest;
 
@@ -232,7 +236,7 @@ static void chr_read(void *opaque, const uint8_t *buf, int size)
         /* send back features to qemu */
         msg.flags |= VHOST_USER_REPLY_MASK;
         msg.size = sizeof(m.u64);
-        msg.u64 = 0;
+        msg.u64 = (1ULL << VHOST_USER_PROTOCOL_F_MQ);
         p = (uint8_t *) &msg;
         qemu_chr_fe_write_all(chr, p, VHOST_USER_HDR_SIZE + msg.size);
         break;
@@ -266,6 +270,16 @@ static void chr_read(void *opaque, const uint8_t *buf, int size)
          */
         qemu_set_nonblock(fd);
         break;
+
+    case VHOST_USER_GET_QUEUE_NUM:
+        /* send back the number of queues we support (let it be 2) to qemu */
+        msg.flags |= VHOST_USER_REPLY_MASK;
+        msg.size = sizeof(m.u64);
+        msg.u64 = 2;
+        p = (uint8_t *) &msg;
+        qemu_chr_fe_write_all(chr, p, VHOST_USER_HDR_SIZE + msg.size);
+        break;
+
     default:
         break;
     }
-- 
1.9.0
Re: [Qemu-devel] [PULL 12/22] vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE
On Sat, Oct 03, 2015 at 07:33:16PM +0300, Michael S. Tsirkin wrote: > On Fri, Oct 02, 2015 at 06:18:51PM +0200, Paolo Bonzini wrote: > > > > > > On 24/09/2015 15:20, Michael S. Tsirkin wrote: > > > From: Yuanhan Liu > > > > > > Quote from Michael: > > > > > > We really should rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE. > > > > Where is the corresponding Linux patch for this? > > > > I would like to fetch the updated headers for KVM, and this is breaking > > it. In fact, a patch that just renames the #define (without providing > > the old name for backwards compatibility) would be NACKed in upstream Linux. > > > > Paolo > > Right. And it turns out this whole approach is wrong. I intend to > revert this patch, I was considering to put this patch as the last one in this set, so that we could drop (or revert) it if it's turned out to be wrong (I had vague feeling that it was wrong). Luckily, this patch could be reverted successfully (I firstly thought there might be conflicts). Besides that, we need one extra manual fix: [yliu@yliu-dev ~/qemu]$ gg 'VHOST.*RESET_DEVICE' hw/virtio/vhost-user.c:195:case VHOST_USER_RESET_DEVICE: Michael, shall I send the revert request, or you will do that? > and also drop the patch sending VHOST_RESET_OWNER on > device stop. Something wrong there? --yliu > > > > Suggested-by: Michael S. Tsirkin > > > Signed-off-by: Yuanhan Liu > > > Reviewed-by: Michael S. Tsirkin > > > Signed-off-by: Michael S. 
Tsirkin > > > Signed-off-by: Yuanhan Liu > > > Reviewed-by: Marcel Apfelbaum > > > --- > > > linux-headers/linux/vhost.h | 2 +- > > > hw/net/vhost_net.c | 2 +- > > > hw/virtio/vhost-user.c | 6 +++--- > > > tests/vhost-user-test.c | 2 +- > > > docs/specs/vhost-user.txt | 4 ++-- > > > 5 files changed, 8 insertions(+), 8 deletions(-) > > > > > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h > > > index ead86db..14a0160 100644 > > > --- a/linux-headers/linux/vhost.h > > > +++ b/linux-headers/linux/vhost.h > > > @@ -78,7 +78,7 @@ struct vhost_memory { > > > #define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01) > > > /* Give up ownership, and reset the device to default values. > > > * Allows subsequent call to VHOST_OWNER_SET to succeed. */ > > > -#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02) > > > +#define VHOST_RESET_DEVICE _IO(VHOST_VIRTIO, 0x02) > > > > > > /* Set up/modify memory layout */ > > > #define VHOST_SET_MEM_TABLE _IOW(VHOST_VIRTIO, 0x03, struct > > > vhost_memory) > > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c > > > index 9d32d76..b7d29b7 100644 > > > --- a/hw/net/vhost_net.c > > > +++ b/hw/net/vhost_net.c > > > @@ -287,7 +287,7 @@ static void vhost_net_stop_one(struct vhost_net *net, > > > } else if (net->nc->info->type == > > > NET_CLIENT_OPTIONS_KIND_VHOST_USER) { > > > for (file.index = 0; file.index < net->dev.nvqs; ++file.index) { > > > const VhostOps *vhost_ops = net->dev.vhost_ops; > > > -int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_OWNER, > > > +int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_DEVICE, > > >NULL); > > > assert(r >= 0); > > > } > > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c > > > index 7fe35c6..9cb2f52 100644 > > > --- a/hw/virtio/vhost-user.c > > > +++ b/hw/virtio/vhost-user.c > > > @@ -32,7 +32,7 @@ typedef enum VhostUserRequest { > > > VHOST_USER_GET_FEATURES = 1, > > > VHOST_USER_SET_FEATURES = 2, > > > VHOST_USER_SET_OWNER = 3, > > > -VHOST_USER_RESET_OWNER = 
4, > > > +VHOST_USER_RESET_DEVICE = 4, > > > VHOST_USER_SET_MEM_TABLE = 5, > > > VHOST_USER_SET_LOG_BASE = 6, > > > VHOST_USER_SET_LOG_FD = 7, > > > @@ -98,7 +98,7 @@ static unsigned long int > > > ioctl_to_vhost_user_request[VHOST_USER_MAX] = { > > > VHOST_GET_FEATURES, /* VHOST_USER_GET_FEATURES */ > > > VHOST_SET_FEATURES, /* VHOST_USER_SET_FEATURES */ > > > VHOST_SET_OWNER,/* VHOST_USER_SET_OWNER */ > > > -VHOST_RESET_OWNER, /* VHOST_USER_RESET_OWNER */ > > > +VHOST_RESET_DEVICE, /* VHOST_USER_RES
[Qemu-devel] [PATCH v12 5/7] vhost: introduce vhost_backend_get_vq_index method
Subtracting the idx with the base(dev->vq_index) for vhost-kernel, and then adding it back for vhost-user doesn't seem right. Here introduces a new method vhost_backend_get_vq_index() for getting the right vq index for following vhost messages calls. Suggested-by: Jason Wang Signed-off-by: Yuanhan Liu Reviewed-by: Jason Wang Tested-by: Marcel Apfelbaum --- hw/virtio/vhost-backend.c | 10 +- hw/virtio/vhost-user.c| 12 ++-- hw/virtio/vhost.c | 15 ++- include/hw/virtio/vhost-backend.h | 2 ++ 4 files changed, 27 insertions(+), 12 deletions(-) diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c index 4d68a27..72d1392 100644 --- a/hw/virtio/vhost-backend.c +++ b/hw/virtio/vhost-backend.c @@ -42,11 +42,19 @@ static int vhost_kernel_cleanup(struct vhost_dev *dev) return close(fd); } +static int vhost_kernel_get_vq_index(struct vhost_dev *dev, int idx) +{ +assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs); + +return idx - dev->vq_index; +} + static const VhostOps kernel_ops = { .backend_type = VHOST_BACKEND_TYPE_KERNEL, .vhost_call = vhost_kernel_call, .vhost_backend_init = vhost_kernel_init, -.vhost_backend_cleanup = vhost_kernel_cleanup +.vhost_backend_cleanup = vhost_kernel_cleanup, +.vhost_backend_get_vq_index = vhost_kernel_get_vq_index, }; int vhost_set_backend_type(struct vhost_dev *dev, VhostBackendType backend_type) diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 694fde5..5018fd6 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -393,9 +393,17 @@ static int vhost_user_cleanup(struct vhost_dev *dev) return 0; } +static int vhost_user_get_vq_index(struct vhost_dev *dev, int idx) +{ +assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs); + +return idx; +} + const VhostOps user_ops = { .backend_type = VHOST_BACKEND_TYPE_USER, .vhost_call = vhost_user_call, .vhost_backend_init = vhost_user_init, -.vhost_backend_cleanup = vhost_user_cleanup -}; +.vhost_backend_cleanup = vhost_user_cleanup, 
+.vhost_backend_get_vq_index = vhost_user_get_vq_index, +}; diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index a08c36b..7a7812d 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -719,7 +719,7 @@ static int vhost_virtqueue_start(struct vhost_dev *dev, { hwaddr s, l, a; int r; -int vhost_vq_index = idx - dev->vq_index; +int vhost_vq_index = dev->vhost_ops->vhost_backend_get_vq_index(dev, idx); struct vhost_vring_file file = { .index = vhost_vq_index }; @@ -728,7 +728,6 @@ static int vhost_virtqueue_start(struct vhost_dev *dev, }; struct VirtQueue *vvq = virtio_get_queue(vdev, idx); -assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs); vq->num = state.num = virtio_queue_get_num(vdev, idx); r = dev->vhost_ops->vhost_call(dev, VHOST_SET_VRING_NUM, &state); @@ -822,12 +821,12 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev, struct vhost_virtqueue *vq, unsigned idx) { -int vhost_vq_index = idx - dev->vq_index; +int vhost_vq_index = dev->vhost_ops->vhost_backend_get_vq_index(dev, idx); struct vhost_vring_state state = { .index = vhost_vq_index, }; int r; -assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs); + r = dev->vhost_ops->vhost_call(dev, VHOST_GET_VRING_BASE, &state); if (r < 0) { fprintf(stderr, "vhost VQ %d ring restore failed: %d\n", idx, r); @@ -1066,17 +1065,15 @@ void vhost_virtqueue_mask(struct vhost_dev *hdev, VirtIODevice *vdev, int n, { struct VirtQueue *vvq = virtio_get_queue(vdev, n); int r, index = n - hdev->vq_index; +struct vhost_vring_file file; -assert(n >= hdev->vq_index && n < hdev->vq_index + hdev->nvqs); - -struct vhost_vring_file file = { -.index = index -}; if (mask) { file.fd = event_notifier_get_fd(&hdev->vqs[index].masked_notifier); } else { file.fd = event_notifier_get_fd(virtio_queue_get_guest_notifier(vvq)); } + +file.index = hdev->vhost_ops->vhost_backend_get_vq_index(hdev, n); r = hdev->vhost_ops->vhost_call(hdev, VHOST_SET_VRING_CALL, &file); assert(r >= 0); } diff --git 
a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h index e472f29..e1dfc6d 100644 --- a/include/hw/virtio/vhost-backend.h +++ b/include/hw/virtio/vhost-backend.h @@ -24,12 +24,14 @@ typedef int (*vhost_call)(struct vhost_dev *dev, unsigned long int request, void *arg);
[Qemu-devel] [PATCH v12 3/7] vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE
Quote from Michael:

> Hi Michael,
>
> I'm wondering why we need a payload here, as we don't do that for
> VHOST_SET_OWNER. I mean, stopping one or few queue pairs when a
> connect is about to be close doesn't make sense to me. Instead,
> we should clean up all queue pair when VHOST_RESET_OWNER message
> is received, right?

We really should rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE.

http://lists.nongnu.org/archive/html/qemu-devel/2015-09/msg00050.html

To make it sound clear that it's a per-device message, instead of per-queue
or something else.

Suggested-by: Michael S. Tsirkin
Signed-off-by: Yuanhan Liu
Reviewed-by: Marcel Apfelbaum
Tested-by: Marcel Apfelbaum
---
 docs/specs/vhost-user.txt   | 4 ++--
 hw/net/vhost_net.c          | 2 +-
 hw/virtio/vhost-user.c      | 6 +++---
 linux-headers/linux/vhost.h | 2 +-
 tests/vhost-user-test.c     | 2 +-
 5 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index eec3251..af4b1b9 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -194,10 +194,10 @@ Message types
       as an owner of the session. This can be used on the Slave as a
       "session start" flag.
 
- * VHOST_USER_RESET_OWNER
+ * VHOST_USER_RESET_DEVICE
 
       Id: 4
-      Equivalent ioctl: VHOST_RESET_OWNER
+      Equivalent ioctl: VHOST_RESET_DEVICE
       Master payload: N/A
 
       Issued when a new connection is about to be closed. The Master will no
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 9d32d76..b7d29b7 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -287,7 +287,7 @@ static void vhost_net_stop_one(struct vhost_net *net,
     } else if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) {
         for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
             const VhostOps *vhost_ops = net->dev.vhost_ops;
-            int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_OWNER,
+            int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_DEVICE,
                                           NULL);
             assert(r >= 0);
         }
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 7fe35c6..9cb2f52 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -32,7 +32,7 @@ typedef enum VhostUserRequest {
     VHOST_USER_GET_FEATURES = 1,
     VHOST_USER_SET_FEATURES = 2,
     VHOST_USER_SET_OWNER = 3,
-    VHOST_USER_RESET_OWNER = 4,
+    VHOST_USER_RESET_DEVICE = 4,
     VHOST_USER_SET_MEM_TABLE = 5,
     VHOST_USER_SET_LOG_BASE = 6,
     VHOST_USER_SET_LOG_FD = 7,
@@ -98,7 +98,7 @@ static unsigned long int ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
     VHOST_GET_FEATURES,     /* VHOST_USER_GET_FEATURES */
     VHOST_SET_FEATURES,     /* VHOST_USER_SET_FEATURES */
     VHOST_SET_OWNER,        /* VHOST_USER_SET_OWNER */
-    VHOST_RESET_OWNER,      /* VHOST_USER_RESET_OWNER */
+    VHOST_RESET_DEVICE,     /* VHOST_USER_RESET_DEVICE */
     VHOST_SET_MEM_TABLE,    /* VHOST_USER_SET_MEM_TABLE */
     VHOST_SET_LOG_BASE,     /* VHOST_USER_SET_LOG_BASE */
     VHOST_SET_LOG_FD,       /* VHOST_USER_SET_LOG_FD */
@@ -222,7 +222,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
         break;
 
     case VHOST_USER_SET_OWNER:
-    case VHOST_USER_RESET_OWNER:
+    case VHOST_USER_RESET_DEVICE:
         break;
 
     case VHOST_USER_SET_MEM_TABLE:
diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
index ead86db..14a0160 100644
--- a/linux-headers/linux/vhost.h
+++ b/linux-headers/linux/vhost.h
@@ -78,7 +78,7 @@ struct vhost_memory {
 #define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01)
 /* Give up ownership, and reset the device to default values.
  * Allows subsequent call to VHOST_OWNER_SET to succeed. */
-#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
+#define VHOST_RESET_DEVICE _IO(VHOST_VIRTIO, 0x02)
 
 /* Set up/modify memory layout */
 #define VHOST_SET_MEM_TABLE _IOW(VHOST_VIRTIO, 0x03, struct vhost_memory)
diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
index 75fedf0..e301db7 100644
--- a/tests/vhost-user-test.c
+++ b/tests/vhost-user-test.c
@@ -58,7 +58,7 @@ typedef enum VhostUserRequest {
    VHOST_USER_GET_FEATURES = 1,
    VHOST_USER_SET_FEATURES = 2,
    VHOST_USER_SET_OWNER = 3,
-   VHOST_USER_RESET_OWNER = 4,
+   VHOST_USER_RESET_DEVICE = 4,
    VHOST_USER_SET_MEM_TABLE = 5,
    VHOST_USER_SET_LOG_BASE = 6,
    VHOST_USER_SET_LOG_FD = 7,
-- 
1.9.0
[Qemu-devel] [PATCH v12 7/7] vhost-user: add a new message to disable/enable a specific virt queue.
From: Changchun Ouyang Add a new message, VHOST_USER_SET_VRING_ENABLE, to enable or disable a specific virt queue, which is similar to attach/detach queue for tap device. virtio driver on guest doesn't have to use max virt queue pair, it could enable any number of virt queue ranging from 1 to max virt queue pair. Signed-off-by: Changchun Ouyang Signed-off-by: Yuanhan Liu Reviewed-by: Jason Wang Tested-by: Marcel Apfelbaum --- docs/specs/vhost-user.txt | 12 +++- hw/net/vhost_net.c| 18 ++ hw/net/virtio-net.c | 8 hw/virtio/vhost-user.c| 19 +++ include/hw/virtio/vhost-backend.h | 2 ++ include/net/vhost_net.h | 2 ++ 6 files changed, 60 insertions(+), 1 deletion(-) diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt index 477edd1..904a6f2 100644 --- a/docs/specs/vhost-user.txt +++ b/docs/specs/vhost-user.txt @@ -149,7 +149,9 @@ VHOST_USER_GET_PROTOCOL_FEATURES. Master should stop when the number of requested queues is bigger than that. As all queues share one connection, the master uses a unique index for each -queue in the sent message to identify a specified queue. +queue in the sent message to identify a specified queue. One queue pair +is enabled initially. More queues are enabled dynamically, by sending +message VHOST_USER_SET_VRING_ENABLE. Message types - @@ -328,3 +330,11 @@ Message types Query how many queues the backend supports. This request should be sent only when VHOST_USER_PROTOCOL_F_MQ is set in quried protocol features by VHOST_USER_GET_PROTOCOL_FEATURES. + + * VHOST_USER_SET_VRING_ENABLE + + Id: 18 + Equivalent ioctl: N/A + Master payload: vring state description + + Signal slave to enable or disable corresponding vring. 
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index e2cb40b..1dcac1d 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -422,6 +422,19 @@ VHostNetState *get_vhost_net(NetClientState *nc) return vhost_net; } + +int vhost_set_vring_enable(NetClientState *nc, int enable) +{ +VHostNetState *net = get_vhost_net(nc); +const VhostOps *vhost_ops = net->dev.vhost_ops; + +if (vhost_ops->vhost_backend_set_vring_enable) { +return vhost_ops->vhost_backend_set_vring_enable(&net->dev, enable); +} + +return 0; +} + #else uint64_t vhost_net_get_max_queues(VHostNetState *net) { @@ -472,4 +485,9 @@ VHostNetState *get_vhost_net(NetClientState *nc) { return 0; } + +int vhost_set_vring_enable(NetClientState *nc, int enable) +{ +return 0; +} #endif diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index f72eebf..916222f 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -406,6 +406,10 @@ static int peer_attach(VirtIONet *n, int index) return 0; } +if (nc->peer->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) { +vhost_set_vring_enable(nc->peer, 1); +} + if (nc->peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) { return 0; } @@ -421,6 +425,10 @@ static int peer_detach(VirtIONet *n, int index) return 0; } +if (nc->peer->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) { +vhost_set_vring_enable(nc->peer, 0); +} + if (nc->peer->info->type != NET_CLIENT_OPTIONS_KIND_TAP) { return 0; } diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index d6850d5..aca10a2 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -48,6 +48,7 @@ typedef enum VhostUserRequest { VHOST_USER_GET_PROTOCOL_FEATURES = 15, VHOST_USER_SET_PROTOCOL_FEATURES = 16, VHOST_USER_GET_QUEUE_NUM = 17, +VHOST_USER_SET_VRING_ENABLE = 18, VHOST_USER_MAX } VhostUserRequest; @@ -290,6 +291,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, case VHOST_USER_SET_VRING_NUM: case VHOST_USER_SET_VRING_BASE: +case VHOST_USER_SET_VRING_ENABLE: 
memcpy(&msg.state, arg, sizeof(struct vhost_vring_state)); msg.size = sizeof(m.state); break; @@ -406,6 +408,22 @@ static int vhost_user_init(struct vhost_dev *dev, void *opaque) return 0; } +static int vhost_user_set_vring_enable(struct vhost_dev *dev, int enable) +{ +struct vhost_vring_state state = { +.index = dev->vq_index, +.num = enable, +}; + +assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER); + +if (!(dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MQ))) { +return -1; +} + +return vhost_user_call(dev, VHOST_USER_SET_VRING_ENABLE, &state); +} + static int vhost_user_cleanup(struct vhost_dev *dev) { assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER); @@ -428,4 +446,5 @@ const VhostOps user_ops = {
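As a rough illustration of how a slave might act on VHOST_USER_SET_VRING_ENABLE, here is a minimal, self-contained sketch. The struct and handler names are hypothetical; only the index/num payload layout mirrors the vring state description used by the patch (where num is 1 to enable and 0 to disable):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the vring-state payload carried by
 * VHOST_USER_SET_VRING_ENABLE (fields mirror struct vhost_vring_state). */
struct vring_state {
    unsigned int index;  /* which vring the message targets */
    unsigned int num;    /* 1 = enable, 0 = disable */
};

#define MAX_VRINGS 8

/* Toy slave-side state: one enabled flag per vring. */
static bool vring_enabled[MAX_VRINGS];

/* Sketch of how a slave could apply the message: validate the index,
 * then flip the per-vring flag. */
static int handle_set_vring_enable(const struct vring_state *state)
{
    if (state->index >= MAX_VRINGS) {
        return -1;  /* reject out-of-range vring indexes */
    }
    vring_enabled[state->index] = (state->num != 0);
    return 0;
}
```

A real slave would additionally stop polling/processing a disabled ring; this sketch only tracks the flag.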
[Qemu-devel] [PATCH v12 4/7] vhost-user: add VHOST_USER_GET_QUEUE_NUM message
This is for querying how many queues the backend supports if it has MQ support (when the VHOST_USER_PROTOCOL_F_MQ flag is set in the queried protocol features). vhost_net_get_max_queues() is the interface to export that value, and to tell whether the backend supports the # of queues the user requested, which is done in the following patch. Signed-off-by: Yuanhan Liu Reviewed-by: Marcel Apfelbaum Tested-by: Marcel Apfelbaum --- v12: mark VHOST_USER_GET_QUEUE_NUM as a message that needs a reply. --- docs/specs/vhost-user.txt | 12 hw/net/vhost_net.c| 12 hw/virtio/vhost-user.c| 15 ++- include/hw/virtio/vhost.h | 1 + include/net/vhost_net.h | 1 + 5 files changed, 40 insertions(+), 1 deletion(-) diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt index af4b1b9..5346df2 100644 --- a/docs/specs/vhost-user.txt +++ b/docs/specs/vhost-user.txt @@ -115,6 +115,7 @@ the ones that do: * VHOST_GET_FEATURES * VHOST_GET_PROTOCOL_FEATURES * VHOST_GET_VRING_BASE + * VHOST_USER_GET_QUEUE_NUM There are several messages that the master sends with file descriptors passed in the ancillary data: @@ -301,3 +302,14 @@ Message types Bits (0-7) of the payload contain the vring index. Bit 8 is the invalid FD flag. This flag is set when there is no file descriptor in the ancillary data. + + * VHOST_USER_GET_QUEUE_NUM + + Id: 17 + Equivalent ioctl: N/A + Master payload: N/A + Slave payload: u64 + + Query how many queues the backend supports. This request should be + sent only when VHOST_USER_PROTOCOL_F_MQ is set in queried protocol + features by VHOST_USER_GET_PROTOCOL_FEATURES. 
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index b7d29b7..f663e5a 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -122,6 +122,11 @@ void vhost_net_ack_features(struct vhost_net *net, uint64_t features) vhost_ack_features(&net->dev, vhost_net_get_feature_bits(net), features); } +uint64_t vhost_net_get_max_queues(VHostNetState *net) +{ +return net->dev.max_queues; +} + static int vhost_net_get_fd(NetClientState *backend) { switch (backend->info->type) { @@ -144,6 +149,8 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options) goto fail; } +net->dev.max_queues = 1; + if (backend_kernel) { r = vhost_net_get_fd(options->net_backend); if (r < 0) { @@ -414,6 +421,11 @@ VHostNetState *get_vhost_net(NetClientState *nc) return vhost_net; } #else +uint64_t vhost_net_get_max_queues(VHostNetState *net) +{ +return 1; +} + struct vhost_net *vhost_net_init(VhostNetOptions *options) { error_report("vhost-net support is not compiled in"); diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 9cb2f52..694fde5 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -25,7 +25,9 @@ #define VHOST_MEMORY_MAX_NREGIONS8 #define VHOST_USER_F_PROTOCOL_FEATURES 30 -#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x0ULL +#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x1ULL + +#define VHOST_USER_PROTOCOL_F_MQ0 typedef enum VhostUserRequest { VHOST_USER_NONE = 0, @@ -45,6 +47,7 @@ typedef enum VhostUserRequest { VHOST_USER_SET_VRING_ERR = 14, VHOST_USER_GET_PROTOCOL_FEATURES = 15, VHOST_USER_SET_PROTOCOL_FEATURES = 16, +VHOST_USER_GET_QUEUE_NUM = 17, VHOST_USER_MAX } VhostUserRequest; @@ -211,6 +214,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, switch (msg_request) { case VHOST_USER_GET_FEATURES: case VHOST_USER_GET_PROTOCOL_FEATURES: +case VHOST_USER_GET_QUEUE_NUM: need_reply = 1; break; @@ -315,6 +319,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, switch (msg_request) { case 
VHOST_USER_GET_FEATURES: case VHOST_USER_GET_PROTOCOL_FEATURES: +case VHOST_USER_GET_QUEUE_NUM: if (msg.size != sizeof(m.u64)) { error_report("Received bad msg size."); return -1; @@ -366,6 +371,14 @@ static int vhost_user_init(struct vhost_dev *dev, void *opaque) if (err < 0) { return err; } + +/* query the max queues we support if backend supports Multiple Queue */ +if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MQ)) { +err = vhost_user_call(dev, VHOST_USER_GET_QUEUE_NUM, &dev->max_queues); +if (err < 0) { +return err; +} +} } return 0; diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h index 6467c73..c3758f3 100644 --- a/include/hw/virtio/vhost.h +++ b/include/hw/virtio/vhost.h @@ -48,6 +48,7 @@ struct vhost_dev { unsigned long long acked_features; unsigned long long backend_features; unsigned long long p
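The gating logic added to vhost_user_init() above — query the queue count only when the backend advertised the MQ protocol feature, otherwise assume a single queue — can be sketched in isolation like this. fake_backend_queue_num() is an invented stand-in for the VHOST_USER_GET_QUEUE_NUM round trip:

```c
#include <assert.h>
#include <stdint.h>

#define VHOST_USER_PROTOCOL_F_MQ 0

/* Hypothetical transport stub: in QEMU this would be a vhost_user_call()
 * sending VHOST_USER_GET_QUEUE_NUM and reading a u64 reply back. */
static uint64_t fake_backend_queue_num(void)
{
    return 4;  /* pretend the backend supports 4 queues */
}

/* Sketch of the master-side decision: only ask the backend for its max
 * queue count when the MQ protocol feature bit was negotiated. */
static uint64_t query_max_queues(uint64_t protocol_features)
{
    if (protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MQ)) {
        return fake_backend_queue_num();
    }
    return 1;  /* no MQ support: a single queue pair */
}
```

The value returned here corresponds to dev.max_queues in the patch, which vhost_net_get_max_queues() later exposes to the queue-count check.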
[Qemu-devel] [PATCH v12 2/7] vhost-user: add protocol feature negotiation
From: "Michael S. Tsirkin" Support a separate bitmask for vhost-user protocol features, and messages to get/set protocol features. Invoke them at init. No features are defined yet. [ leverage vhost_user_call for request handling -- Yuanhan Liu ] Signed-off-by: Michael S. Tsirkin Signed-off-by: Yuanhan Liu Reviewed-by: Marcel Apfelbaum Tested-by: Marcel Apfelbaum --- docs/specs/vhost-user.txt | 37 + hw/net/vhost_net.c| 2 ++ hw/virtio/vhost-user.c| 31 +++ include/hw/virtio/vhost.h | 1 + 4 files changed, 71 insertions(+) diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt index 650bb18..eec3251 100644 --- a/docs/specs/vhost-user.txt +++ b/docs/specs/vhost-user.txt @@ -113,6 +113,7 @@ message replies. Most of the requests don't require replies. Here is a list of the ones that do: * VHOST_GET_FEATURES + * VHOST_GET_PROTOCOL_FEATURES * VHOST_GET_VRING_BASE There are several messages that the master sends with file descriptors passed @@ -127,6 +128,13 @@ in the ancillary data: If Master is unable to send the full message or receives a wrong reply it will close the connection. An optional reconnection mechanism can be implemented. +Any protocol extensions are gated by protocol feature bits, +which allows full backwards compatibility on both master +and slave. +As older slaves don't support negotiating protocol features, +a feature bit was dedicated for this purpose: +#define VHOST_USER_F_PROTOCOL_FEATURES 30 + Message types - @@ -138,6 +146,8 @@ Message types Slave payload: u64 Get from the underlying vhost implementation the features bitmask. + Feature bit VHOST_USER_F_PROTOCOL_FEATURES signals slave support for + VHOST_USER_GET_PROTOCOL_FEATURES and VHOST_USER_SET_PROTOCOL_FEATURES. * VHOST_USER_SET_FEATURES @@ -146,6 +156,33 @@ Message types Master payload: u64 Enable features in the underlying vhost implementation using a bitmask. 
+ Feature bit VHOST_USER_F_PROTOCOL_FEATURES signals slave support for + VHOST_USER_GET_PROTOCOL_FEATURES and VHOST_USER_SET_PROTOCOL_FEATURES. + + * VHOST_USER_GET_PROTOCOL_FEATURES + + Id: 15 + Equivalent ioctl: N/A + Master payload: N/A + Slave payload: u64 + + Get the protocol feature bitmask from the underlying vhost implementation. + Only legal if feature bit VHOST_USER_F_PROTOCOL_FEATURES is present in + VHOST_USER_GET_FEATURES. + Note: slave that reported VHOST_USER_F_PROTOCOL_FEATURES must support + this message even before VHOST_USER_SET_FEATURES was called. + + * VHOST_USER_SET_PROTOCOL_FEATURES + + Id: 16 + Equivalent Ioctl: N/A + Master payload: u64 + + Enable protocol features in the underlying vhost implementation. + Only legal if feature bit VHOST_USER_F_PROTOCOL_FEATURES is present in + VHOST_USER_GET_FEATURES. + Note: slave that reported VHOST_USER_F_PROTOCOL_FEATURES must support + this message even before VHOST_USER_SET_FEATURES was called. * VHOST_USER_SET_OWNER diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index 1d76b94..9d32d76 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -152,8 +152,10 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options) net->dev.backend_features = qemu_has_vnet_hdr(options->net_backend) ? 
0 : (1ULL << VHOST_NET_F_VIRTIO_NET_HDR); net->backend = r; +net->dev.protocol_features = 0; } else { net->dev.backend_features = 0; +net->dev.protocol_features = 0; net->backend = -1; } net->nc = options->net_backend; diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 13677ac..7fe35c6 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -24,6 +24,8 @@ #include #define VHOST_MEMORY_MAX_NREGIONS8 +#define VHOST_USER_F_PROTOCOL_FEATURES 30 +#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x0ULL typedef enum VhostUserRequest { VHOST_USER_NONE = 0, @@ -41,6 +43,8 @@ typedef enum VhostUserRequest { VHOST_USER_SET_VRING_KICK = 12, VHOST_USER_SET_VRING_CALL = 13, VHOST_USER_SET_VRING_ERR = 14, +VHOST_USER_GET_PROTOCOL_FEATURES = 15, +VHOST_USER_SET_PROTOCOL_FEATURES = 16, VHOST_USER_MAX } VhostUserRequest; @@ -206,11 +210,13 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, switch (msg_request) { case VHOST_USER_GET_FEATURES: +case VHOST_USER_GET_PROTOCOL_FEATURES: need_reply = 1; break; case VHOST_USER_SET_FEATURES: case VHOST_USER_SET_LOG_BASE: +case VHOST_USER_SET_PROTOCOL_FEATURES: msg.u64 = *((__u64 *) arg); msg.size = sizeof(m.u64); break; @@ -308,6 +314,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int req
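Put together, the handshake this patch implements — check feature bit 30 from GET_FEATURES, fetch the slave's protocol features, mask them to the bits the master understands, and send the result back with SET_PROTOCOL_FEATURES — might look like this stripped-down sketch. The function and parameter names are invented for illustration; the mask is 0, matching the patch ("No features are defined yet"):

```c
#include <assert.h>
#include <stdint.h>

#define VHOST_USER_F_PROTOCOL_FEATURES 30
/* Bits this master understands; none are defined yet in this patch. */
#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x0ULL

/* Sketch of the init-time negotiation.  slave_features stands in for the
 * GET_FEATURES reply, slave_protocol_features for GET_PROTOCOL_FEATURES;
 * *out_set_msg receives the payload that would be sent back via
 * SET_PROTOCOL_FEATURES.  Returns the negotiated protocol features. */
static uint64_t negotiate_protocol_features(uint64_t slave_features,
                                            uint64_t slave_protocol_features,
                                            uint64_t *out_set_msg)
{
    if (!(slave_features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES))) {
        /* Old slave: protocol-feature messages must not be sent at all. */
        *out_set_msg = 0;
        return 0;
    }
    uint64_t negotiated = slave_protocol_features &
                          VHOST_USER_PROTOCOL_FEATURE_MASK;
    *out_set_msg = negotiated;
    return negotiated;
}
```

With the mask at 0 the negotiated set is always empty; later patches in the series widen the mask (e.g. to include the MQ bit).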
[Qemu-devel] [PATCH v12 6/7] vhost-user: add multiple queue support
From: Changchun Ouyang This patch is initially based on a patch from Nikolay Nikolaev. It adds vhost-user multiple queue support, by creating a nc and vhost_net pair for each queue. QEMU exits if it finds that the backend can't support the number of requested queues (given by the queues=# option). The max number is queried by a new message, VHOST_USER_GET_QUEUE_NUM, which is sent only when the protocol feature VHOST_USER_PROTOCOL_F_MQ is present. The max queue check is done at the vhost-user initiation stage. We initiate one queue first, which, in the meantime, also gets the max_queues the backend supports. In older versions, it was reported that some messages were sent more times than necessary. Here we reached an agreement with Michael that we could categorize vhost-user messages into 2 types: dev-specific messages, which should be sent only once, and vring-specific messages, which should be sent per queue. Here I introduced a helper function vhost_user_dev_request(), which lists the following messages as dev-specific messages: VHOST_USER_SET_OWNER VHOST_USER_RESET_DEVICE VHOST_USER_SET_MEM_TABLE VHOST_USER_GET_QUEUE_NUM For the above messages, we simply send them once and ignore any later attempts. Signed-off-by: Nikolay Nikolaev Signed-off-by: Changchun Ouyang Signed-off-by: Yuanhan Liu Reviewed-by: Jason Wang Tested-by: Marcel Apfelbaum --- v12: rename vhost_user_one_time_request() to vhost_user_dev_request(). use vhost_net_set_vq_index() instead of direct assignment. 
v11: invoke qmp_set_link() directly --- docs/specs/vhost-user.txt | 15 + hw/net/vhost_net.c| 20 --- hw/virtio/vhost-user.c| 22 hw/virtio/vhost.c | 5 +- net/vhost-user.c | 141 ++ qapi-schema.json | 6 +- qemu-options.hx | 5 +- 7 files changed, 152 insertions(+), 62 deletions(-) diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt index 5346df2..477edd1 100644 --- a/docs/specs/vhost-user.txt +++ b/docs/specs/vhost-user.txt @@ -136,6 +136,21 @@ As older slaves don't support negotiating protocol features, a feature bit was dedicated for this purpose: #define VHOST_USER_F_PROTOCOL_FEATURES 30 +Multiple queue support +-- + +Multiple queue is treated as a protocol extension, hence the slave has to +implement protocol features first. The multiple queues feature is supported +only when the protocol feature VHOST_USER_PROTOCOL_F_MQ (bit 0) is set: +#define VHOST_USER_PROTOCOL_F_MQ0 + +The max number of queues the slave supports can be queried with message +VHOST_USER_GET_PROTOCOL_FEATURES. Master should stop when the number of +requested queues is bigger than that. + +As all queues share one connection, the master uses a unique index for each +queue in the sent message to identify a specified queue. 
+ Message types - diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index f663e5a..e2cb40b 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -138,6 +138,11 @@ static int vhost_net_get_fd(NetClientState *backend) } } +static void vhost_net_set_vq_index(struct vhost_net *net, int vq_index) +{ +net->dev.vq_index = vq_index; +} + struct vhost_net *vhost_net_init(VhostNetOptions *options) { int r; @@ -148,8 +153,11 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options) fprintf(stderr, "vhost-net requires net backend to be setup\n"); goto fail; } +net->nc = options->net_backend; net->dev.max_queues = 1; +net->dev.nvqs = 2; +net->dev.vqs = net->vqs; if (backend_kernel) { r = vhost_net_get_fd(options->net_backend); @@ -164,11 +172,10 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options) net->dev.backend_features = 0; net->dev.protocol_features = 0; net->backend = -1; -} -net->nc = options->net_backend; -net->dev.nvqs = 2; -net->dev.vqs = net->vqs; +/* vhost-user needs vq_index to initiate a specific queue pair */ +vhost_net_set_vq_index(net, net->nc->queue_index * net->dev.nvqs); +} r = vhost_dev_init(&net->dev, options->opaque, options->backend_type); @@ -196,11 +203,6 @@ fail: return NULL; } -static void vhost_net_set_vq_index(struct vhost_net *net, int vq_index) -{ -net->dev.vq_index = vq_index; -} - static int vhost_net_set_vnet_endian(VirtIODevice *dev, NetClientState *peer, bool set) { diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 5018fd6..d6850d5 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -187,6 +187,19 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg, 0 : -1; } +static bool vhost_user_dev_request(V
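The dev-specific vs. vring-specific split described in this patch's commit message can be sketched as follows. The enum values mirror the series; should_skip() is a hypothetical condensation of the check added to vhost_user_call(), where vq_index == 0 identifies the first queue pair:

```c
#include <assert.h>
#include <stdbool.h>

/* Subset of the vhost-user request ids relevant here. */
typedef enum {
    VHOST_USER_SET_OWNER = 3,
    VHOST_USER_RESET_DEVICE = 4,
    VHOST_USER_SET_MEM_TABLE = 5,
    VHOST_USER_SET_VRING_NUM = 8,
    VHOST_USER_GET_QUEUE_NUM = 17,
} VhostUserRequest;

/* Mirrors the patch's vhost_user_dev_request(): device-wide messages
 * are sent once; everything else is per-vring. */
static bool vhost_user_dev_request(VhostUserRequest request)
{
    switch (request) {
    case VHOST_USER_SET_OWNER:
    case VHOST_USER_RESET_DEVICE:
    case VHOST_USER_SET_MEM_TABLE:
    case VHOST_USER_GET_QUEUE_NUM:
        return true;
    default:
        return false;
    }
}

/* The send path skips device-wide requests for every queue pair except
 * the first one, so each is delivered exactly once per device. */
static bool should_skip(VhostUserRequest request, int vq_index)
{
    return vhost_user_dev_request(request) && vq_index != 0;
}
```

This is what prevents, e.g., VHOST_USER_SET_MEM_TABLE from being re-sent N times when N queue pairs are initialized.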
[Qemu-devel] [PATCH v12 0/7] vhost-user multiple queue support
Hi, Here is the updated patch set for enabling vhost-user multiple queue. It works well in my tests, and it also passes Marcel's testing (thanks to him for testing!). --- There are only a few minor changes compared with the last version, such as typo fixes, a function rename, and so on. The reason I made another formal version, after addressing all comments I got so far, is that I hope Michael could consider merging this patch set. And if it's still not ready, Michael, please kindly tell me so that I can fix it. This patch set introduces 2 more vhost-user messages: VHOST_USER_GET_QUEUE_NUM, for querying how many queues the backend supports, and VHOST_USER_SET_VRING_ENABLE, for enabling/disabling a specific virt queue. Both of the two new messages are treated as vhost protocol extensions, and that's why Michael's patch "vhost-user: add protocol feature negotiation" is also included here. Patches 1-5 are all preparatory work for actually enabling multiple queue. Patch 6 is the major patch for enabling multiple queue; it also tries to address two major concerns from Michael: no feedback from the backend if it can't support the # of requested queues, and all messages being sent N times. It also fixes a hidden bug. Patch 7 introduces the VHOST_USER_SET_VRING_ENABLE message, to enable or disable a specific vring. v12: - typo fixes, function rename, and so on. v11: - typo fixes pointed out by Eric - define a dummy vhost_net_get_max_queues() when !CONFIG_VHOST_NET. - As suggested by Jason Wang, invoke qmp_set_link() directly to remove duplicate code. v10: - typo fixes pointed out by Eric - [PATCH 6]: don't treat VHOST_USER_SET/GET_[PROTOCOL]_FEATURES as one-time requests, as the two feature bits need to be stored per-device. v9: - As suggested by Jason Wang, patch 5 introduces a new vhost backend method: vhost_backend_get_vq_index(). - Use qemu_find_net_clients_except() at net/vhost-user.c for gathering all related ncs so that we could register the chr dev event handler once. 
Which is also suggested by Jason Wang. Thanks. --yliu --- Changchun Ouyang (2): vhost-user: add multiple queue support vhost-user: add a new message to disable/enable a specific virt queue. Michael S. Tsirkin (1): vhost-user: add protocol feature negotiation Yuanhan Liu (4): vhost-user: use VHOST_USER_XXX macro for switch statement vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE vhost-user: add VHOST_USER_GET_QUEUE_NUM message vhost: introduce vhost_backend_get_vq_index method docs/specs/vhost-user.txt | 78 - hw/net/vhost_net.c| 54 --- hw/net/virtio-net.c | 8 +++ hw/virtio/vhost-backend.c | 10 ++- hw/virtio/vhost-user.c| 139 +++-- hw/virtio/vhost.c | 20 +++--- include/hw/virtio/vhost-backend.h | 4 ++ include/hw/virtio/vhost.h | 2 + include/net/vhost_net.h | 3 + linux-headers/linux/vhost.h | 2 +- net/vhost-user.c | 141 +- qapi-schema.json | 6 +- qemu-options.hx | 5 +- tests/vhost-user-test.c | 2 +- 14 files changed, 377 insertions(+), 97 deletions(-) -- 1.9.0
[Qemu-devel] [PATCH v12 1/7] vhost-user: use VHOST_USER_XXX macro for switch statement
This lets vhost_user_call handle extended requests, such as VHOST_USER_GET/SET_PROTOCOL_FEATURES, instead of invoking vhost_user_read/write and constructing the msg again by ourselves. Signed-off-by: Yuanhan Liu Reviewed-by: Marcel Apfelbaum Tested-by: Marcel Apfelbaum --- hw/virtio/vhost-user.c | 38 ++ 1 file changed, 22 insertions(+), 16 deletions(-) diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index e7ab829..13677ac 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -193,27 +193,33 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER); -msg_request = vhost_user_request_translate(request); +/* only translate vhost ioctl requests */ +if (request > VHOST_USER_MAX) { +msg_request = vhost_user_request_translate(request); +} else { +msg_request = request; +} + msg.request = msg_request; msg.flags = VHOST_USER_VERSION; msg.size = 0; -switch (request) { -case VHOST_GET_FEATURES: +switch (msg_request) { +case VHOST_USER_GET_FEATURES: need_reply = 1; break; -case VHOST_SET_FEATURES: -case VHOST_SET_LOG_BASE: +case VHOST_USER_SET_FEATURES: +case VHOST_USER_SET_LOG_BASE: msg.u64 = *((__u64 *) arg); msg.size = sizeof(m.u64); break; -case VHOST_SET_OWNER: -case VHOST_RESET_OWNER: +case VHOST_USER_SET_OWNER: +case VHOST_USER_RESET_OWNER: break; -case VHOST_SET_MEM_TABLE: +case VHOST_USER_SET_MEM_TABLE: for (i = 0; i < dev->mem->nregions; ++i) { struct vhost_memory_region *reg = dev->mem->regions + i; ram_addr_t ram_addr; @@ -246,30 +252,30 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, break; -case VHOST_SET_LOG_FD: +case VHOST_USER_SET_LOG_FD: fds[fd_num++] = *((int *) arg); break; -case VHOST_SET_VRING_NUM: -case VHOST_SET_VRING_BASE: +case VHOST_USER_SET_VRING_NUM: +case VHOST_USER_SET_VRING_BASE: memcpy(&msg.state, arg, sizeof(struct vhost_vring_state)); msg.size = sizeof(m.state); break; -case 
VHOST_GET_VRING_BASE: +case VHOST_USER_GET_VRING_BASE: memcpy(&msg.state, arg, sizeof(struct vhost_vring_state)); msg.size = sizeof(m.state); need_reply = 1; break; -case VHOST_SET_VRING_ADDR: +case VHOST_USER_SET_VRING_ADDR: memcpy(&msg.addr, arg, sizeof(struct vhost_vring_addr)); msg.size = sizeof(m.addr); break; -case VHOST_SET_VRING_KICK: -case VHOST_SET_VRING_CALL: -case VHOST_SET_VRING_ERR: +case VHOST_USER_SET_VRING_KICK: +case VHOST_USER_SET_VRING_CALL: +case VHOST_USER_SET_VRING_ERR: file = arg; msg.u64 = file->index & VHOST_USER_VRING_IDX_MASK; msg.size = sizeof(m.u64); -- 1.9.0
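The guard added at the top of vhost_user_call() — translate only kernel vhost ioctl codes, and pass values that are already vhost-user message ids through untouched — reduces to something like the sketch below. translate() is a dummy stand-in for vhost_user_request_translate(); real ioctl codes are large _IO()/_IOW() encodings, far above VHOST_USER_MAX:

```c
#include <assert.h>

#define VHOST_USER_MAX 18   /* one past the last vhost-user request id */

/* Dummy stand-in for vhost_user_request_translate(): maps a kernel vhost
 * ioctl code onto its vhost-user message id (a lookup table in reality). */
static unsigned long translate(unsigned long ioctl_code)
{
    (void)ioctl_code;
    return 1;  /* pretend every ioctl maps to VHOST_USER_GET_FEATURES */
}

/* Mirrors the patch's guard: small values are already vhost-user message
 * ids and must not be translated a second time. */
static unsigned long to_msg_request(unsigned long request)
{
    return request > VHOST_USER_MAX ? translate(request) : request;
}
```

This is what allows later patches to call vhost_user_call() directly with VHOST_USER_GET/SET_PROTOCOL_FEATURES ids instead of hand-building messages.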
Re: [Qemu-devel] [PATCH v11 2/7] vhost-user: add protocol feature negotiation
On Thu, Sep 24, 2015 at 01:13:24PM +0300, Marcel Apfelbaum wrote: > >diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c > >index 1d76b94..9d32d76 100644 > >--- a/hw/net/vhost_net.c > >+++ b/hw/net/vhost_net.c > >@@ -152,8 +152,10 @@ struct vhost_net *vhost_net_init(VhostNetOptions > >*options) > > net->dev.backend_features = qemu_has_vnet_hdr(options->net_backend) > > ? 0 : (1ULL << VHOST_NET_F_VIRTIO_NET_HDR); > > net->backend = r; > >+net->dev.protocol_features = 0; > > } else { > > net->dev.backend_features = 0; > >+net->dev.protocol_features = 0; > > net->backend = -1; > > } > > Maybe protocol_features assignment should be outside the if clause. > (assigned to 0 in both cases) Yeah, we could do that. However, it seems it would take more effort, handling patch conflicts while rebasing, than it's worth. Therefore I will keep it. --yliu
Re: [Qemu-devel] [PATCH v11 2/7] vhost-user: add protocol feature negotiation
On Thu, Sep 24, 2015 at 01:13:24PM +0300, Marcel Apfelbaum wrote: > On 09/23/2015 07:19 AM, Yuanhan Liu wrote: > >From: "Michael S. Tsirkin" > > > >Support a separate bitmask for vhost-user protocol features, > >and messages to get/set protocol features. > > > >Invoke them at init. > > > >No features are defined yet. > > > >[ leverage vhost_user_call for request handling -- Yuanhan Liu ] > > > >Signed-off-by: Michael S. Tsirkin > >Signed-off-by: Yuanhan Liu > >--- > > docs/specs/vhost-user.txt | 37 + > > hw/net/vhost_net.c| 2 ++ > > hw/virtio/vhost-user.c| 31 +++ > > include/hw/virtio/vhost.h | 1 + > > 4 files changed, 71 insertions(+) > > > >diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt > >index 650bb18..70da3b1 100644 > >--- a/docs/specs/vhost-user.txt > >+++ b/docs/specs/vhost-user.txt > >@@ -113,6 +113,7 @@ message replies. Most of the requests don't require > >replies. Here is a list of > > the ones that do: > > > > * VHOST_GET_FEATURES > >+ * VHOST_GET_PROTOCOL_FEATURES > > * VHOST_GET_VRING_BASE > > > > There are several messages that the master sends with file descriptors > > passed > >@@ -127,6 +128,13 @@ in the ancillary data: > > If Master is unable to send the full message or receives a wrong reply it > > will > > close the connection. An optional reconnection mechanism can be > > implemented. > > > >+Any protocol extensions are gated by protocol feature bits, > >+which allows full backwards compatibility on both master > >+and slave. > >+As older slaves don't support negotiating protocol features, > >+a feature bit was dedicated for this purpose: > >+#define VHOST_USER_F_PROTOCOL_FEATURES 30 > >+ > > Message types > > - > > > >@@ -138,6 +146,8 @@ Message types > >Slave payload: u64 > > > >Get from the underlying vhost implementation the features bitmask. > >+ Feature bit VHOST_USER_F_PROTOCOL_FEATURES signals slave support for > >+ VHOST_USER_GET_PROTOCOL_FEATURES and VHOST_USER_SET_PROTOCOL_FEATURES. 
> > > > * VHOST_USER_SET_FEATURES > > > >@@ -146,6 +156,33 @@ Message types > >Master payload: u64 > > > >Enable features in the underlying vhost implementation using a > > bitmask. > >+ Feature bit VHOST_USER_F_PROTOCOL_FEATURES signals slave support for > >+ VHOST_USER_GET_PROTOCOL_FEATURES and VHOST_USER_SET_PROTOCOL_FEATURES. > >+ > >+ * VHOST_USER_GET_PROTOCOL_FEATURES > >+ > >+ Id: 15 > >+ Equivalent ioctl: VHOST_GET_FEATURES > > VHOST_USER_GET_PROTOCOL_FEATURES does not have an equivalent ioctl Oops, I didn't notice that. Will fix it soon, and thank you for the review! --yliu > > >+ Master payload: N/A > >+ Slave payload: u64 > >+ > >+ Get the protocol feature bitmask from the underlying vhost > >implementation. > >+ Only legal if feature bit VHOST_USER_F_PROTOCOL_FEATURES is present in > >+ VHOST_USER_GET_FEATURES. > >+ Note: slave that reported VHOST_USER_F_PROTOCOL_FEATURES must support > >+ this message even before VHOST_USER_SET_FEATURES was called. > >+ > >+ * VHOST_USER_SET_PROTOCOL_FEATURES > >+ > >+ Id: 16 > >+ Ioctl: VHOST_SET_FEATURES > > Same here > > >+ Master payload: u64 > >+ > >+ Enable protocol features in the underlying vhost implementation. > >+ Only legal if feature bit VHOST_USER_F_PROTOCOL_FEATURES is present in > >+ VHOST_USER_GET_FEATURES. > >+ Note: slave that reported VHOST_USER_F_PROTOCOL_FEATURES must support > >+ this message even before VHOST_USER_SET_FEATURES was called. > > > > * VHOST_USER_SET_OWNER > > > >diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c > >index 1d76b94..9d32d76 100644 > >--- a/hw/net/vhost_net.c > >+++ b/hw/net/vhost_net.c > >@@ -152,8 +152,10 @@ struct vhost_net *vhost_net_init(VhostNetOptions > >*options) > > net->dev.backend_features = qemu_has_vnet_hdr(options->net_backend) > > ? 0 : (1ULL << VHOST_NET_F_VIRTIO_NET_HDR); > > net->backend = r; > >+net->dev.protocol_features = 0; > > } else { > > net->dev.backend_features = 0; > >+net->dev.protocol_features = 0; > >
Re: [Qemu-devel] [PATCH v11 6/7] vhost-user: add multiple queue support
On Thu, Sep 24, 2015 at 01:34:31PM +0800, Jason Wang wrote: > > > Some nitpicks and comments. If you plan to send another version, please > consider to fix them. I will, but I'd like to hold a while before getting more comments from Michael; I don't want to repost a whole new version too often for minor changes. BTW, Michael, is this patchset ready for merge? If not, please kindly tell me what is wrong so that I can fix it. TBH, I'd really like to see it merged soon, as I'm going to be on holiday for two weeks starting from this Saturday. I can still reply to emails, and even make patches during that time, but I guess I'll only have limited time for that. > > Reviewed-by: Jason Wang Thank you! > > Btw, it's better to extend current vhost-user unit test to support > multiqueue. Please consider this on top this series. Yeah, I will. But, may I do that later? (And if possible, after the merge?) > > > > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt > > index 43db9b4..cfc9d41 100644 > > --- a/docs/specs/vhost-user.txt > > +++ b/docs/specs/vhost-user.txt > > @@ -135,6 +135,21 @@ As older slaves don't support negotiating protocol > > features, > > a feature bit was dedicated for this purpose: > > #define VHOST_USER_F_PROTOCOL_FEATURES 30 > > > > +Multiple queue support > > +-- > > + > > +Multiple queue is treated as a protocol extension, hence the slave has to > > +implement protocol features first. The multiple queues feature is supported > > +only when the protocol feature VHOST_USER_PROTOCOL_F_MQ (bit 0) is set: > > +#define VHOST_USER_PROTOCOL_F_MQ0 > > + > > +The max number of queues the slave supports can be queried with message > > +VHOST_USER_GET_PROTOCOL_FEATURES. Master should stop when the number of > > +requested queues is bigger than that. > > + > > +As all queues share one connection, the master uses a unique index for each > > +queue in the sent message to identify a specified queue. 
> > + > > Message types > > - > > > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c > > index f663e5a..ad82b5c 100644 > > --- a/hw/net/vhost_net.c > > +++ b/hw/net/vhost_net.c > > @@ -148,8 +148,11 @@ struct vhost_net *vhost_net_init(VhostNetOptions > > *options) > > fprintf(stderr, "vhost-net requires net backend to be setup\n"); > > goto fail; > > } > > +net->nc = options->net_backend; > > > > net->dev.max_queues = 1; > > +net->dev.nvqs = 2; > > +net->dev.vqs = net->vqs; > > > > if (backend_kernel) { > > r = vhost_net_get_fd(options->net_backend); > > @@ -164,11 +167,10 @@ struct vhost_net *vhost_net_init(VhostNetOptions > > *options) > > net->dev.backend_features = 0; > > net->dev.protocol_features = 0; > > net->backend = -1; > > -} > > -net->nc = options->net_backend; > > > > -net->dev.nvqs = 2; > > -net->dev.vqs = net->vqs; > > +/* vhost-user needs vq_index to initiate a specific queue pair */ > > +net->dev.vq_index = net->nc->queue_index * net->dev.nvqs; > > Better use vhost_net_set_vq_index() here. I could do that. 
> > > +} > > > > r = vhost_dev_init(&net->dev, options->opaque, > > options->backend_type); > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c > > index 5018fd6..e42fde6 100644 > > --- a/hw/virtio/vhost-user.c > > +++ b/hw/virtio/vhost-user.c > > @@ -187,6 +187,19 @@ static int vhost_user_write(struct vhost_dev *dev, > > VhostUserMsg *msg, > > 0 : -1; > > } > > > > +static bool vhost_user_one_time_request(VhostUserRequest request) > > +{ > > +switch (request) { > > +case VHOST_USER_SET_OWNER: > > +case VHOST_USER_RESET_DEVICE: > > +case VHOST_USER_SET_MEM_TABLE: > > +case VHOST_USER_GET_QUEUE_NUM: > > +return true; > > +default: > > +return false; > > +} > > +} > > + > > static int vhost_user_call(struct vhost_dev *dev, unsigned long int > > request, > > void *arg) > > { > > @@ -207,6 +220,15 @@ static int vhost_user_call(struct vhost_dev *dev, > > unsigned long int request, > > msg_request = request; > > } > > > > +/* > > + * For non-vring specific requests, like VHOST_USER_SET_MEM_TABLE, > > + * we just need send it once in the first time. For later such > > + * request, we just ignore it. > > + */ > > +if (vhost_user_one_time_request(msg_request) && dev->vq_index != 0) { > > +return 0; > > +} > > Looks like vhost_user_dev_request() is a better name here. Yeah, indeed. Thanks. > > > + > > msg.request = msg_request; > > msg.flags = VHOST_USER_VERSION; > > msg.size = 0; > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > > index 7a7812d..c0ed5b2 100644 > > --- a/hw/virtio/vhost.c > > +++ b/hw/virtio/vhost.c > > @@ -874,8 +874,9 @@ static void vhost_eventfd_del(MemoryListener *listener, > > static in
Re: [Qemu-devel] [PATCH v11 6/7] vhost-user: add multiple queue support
On Wed, Sep 23, 2015 at 12:20:00PM +0800, Yuanhan Liu wrote: > From: Changchun Ouyang > [...] > static void net_vhost_user_event(void *opaque, int event) > { > -VhostUserState *s = opaque; > +const char *name = opaque; > +NetClientState *ncs[MAX_QUEUE_NUM]; > +VhostUserState *s; > +Error *err = NULL; > +int queues; > > +queues = qemu_find_net_clients_except(name, ncs, > + NET_CLIENT_OPTIONS_KIND_NIC, > + MAX_QUEUE_NUM); > +s = DO_UPCAST(VhostUserState, nc, ncs[0]); > switch (event) { > case CHR_EVENT_OPENED: > -vhost_user_start(s); > -net_vhost_link_down(s, false); > +if (vhost_user_start(queues, ncs) < 0) { > +exit(1); > +} > +qmp_set_link(name, true, &err); > error_report("chardev \"%s\" went up", s->chr->label); > break; > case CHR_EVENT_CLOSED: > -net_vhost_link_down(s, true); > -vhost_user_stop(s); > +qmp_set_link(name, true, &err); > +vhost_user_stop(queues, ncs); Sigh... A silly Copy & Paste typo. Here is the updated patch: --yliu ---8<--- >From 3793772867024dfabb9eb8eb5c53ae6714f88618 Mon Sep 17 00:00:00 2001 From: Changchun Ouyang Date: Tue, 15 Sep 2015 14:28:38 +0800 Subject: [PATCH v11 6/7] vhost-user: add multiple queue support This patch is initially based on a patch from Nikolay Nikolaev. It adds vhost-user multiple queue support, by creating a nc and vhost_net pair for each queue. QEMU exits if it finds that the backend can't support the number of requested queues (given by the queues=# option). The max number is queried by a new message, VHOST_USER_GET_QUEUE_NUM, which is sent only when the protocol feature VHOST_USER_PROTOCOL_F_MQ is present. The max queue check is done at the vhost-user initiation stage. We initiate one queue first, which, in the meantime, also gets the max_queues the backend supports. In older versions, it was reported that some messages were sent more times than necessary. 
Here we reached an agreement with Michael that we could categorize vhost-user messages into two types: non-vring-specific messages, which should be sent only once, and vring-specific messages, which should be sent per queue. Here I introduced a helper function vhost_user_one_time_request(), which lists the following messages as non-vring-specific messages: VHOST_USER_SET_OWNER VHOST_USER_RESET_DEVICE VHOST_USER_SET_MEM_TABLE VHOST_USER_GET_QUEUE_NUM For the above messages, we simply ignore every occurrence after the first. Signed-off-by: Nikolay Nikolaev Signed-off-by: Changchun Ouyang Signed-off-by: Yuanhan Liu --- v11: invoke qmp_set_link() directly -- Suggested by Jason Wang v10: don't treat VHOST_USER_SET/GET_[PROTOCOL]_FEATURES as one-time requests, as the two feature bits need to be stored per device. v9: as suggested by Jason Wang, invoke qemu_chr_add_handlers() once only, and invoke qemu_find_net_clients_except() in the handler to gather all related ncs for initiating all queues. This addresses a hidden bug in the older version in a more QEMU-like way. v8: set net->dev.vq_index correctly inside vhost_net_init() based on the value from net->nc->queue_index. --- docs/specs/vhost-user.txt | 15 + hw/net/vhost_net.c| 10 ++-- hw/virtio/vhost-user.c| 22 hw/virtio/vhost.c | 5 +- net/vhost-user.c | 141 ++ qapi-schema.json | 6 +- qemu-options.hx | 5 +- 7 files changed, 147 insertions(+), 57 deletions(-) diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt index 43db9b4..cfc9d41 100644 --- a/docs/specs/vhost-user.txt +++ b/docs/specs/vhost-user.txt @@ -135,6 +135,21 @@ As older slaves don't support negotiating protocol features, a feature bit was dedicated for this purpose: #define VHOST_USER_F_PROTOCOL_FEATURES 30 +Multiple queue support +-- + +Multiple queue is treated as a protocol extension, hence the slave has to +implement protocol features first.
The multiple queues feature is supported +only when the protocol feature VHOST_USER_PROTOCOL_F_MQ (bit 0) is set: +#define VHOST_USER_PROTOCOL_F_MQ0 + +The max number of queues the slave supports can be queried with message +VHOST_USER_GET_PROTOCOL_FEATURES. Master should stop when the number of +requested queues is bigger than that. + +As all queues share one connection, the master uses a unique index for each +queue in the sent message to identify a specified queue. + Message types - diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index f663e5a..ad82b5c 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/v
[Qemu-devel] [PATCH v11 5/7] vhost: introduce vhost_backend_get_vq_index method
Minusing the idx with the base(dev->vq_index) for vhost-kernel, and then adding it back for vhost-user doesn't seem right. Here introduces a new method vhost_backend_get_vq_index() for getting the right vq index for following vhost messages calls. Suggested-by: Jason Wang Signed-off-by: Yuanhan Liu Reviewed-by: Jason Wang --- hw/virtio/vhost-backend.c | 10 +- hw/virtio/vhost-user.c| 12 ++-- hw/virtio/vhost.c | 15 ++- include/hw/virtio/vhost-backend.h | 2 ++ 4 files changed, 27 insertions(+), 12 deletions(-) diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c index 4d68a27..72d1392 100644 --- a/hw/virtio/vhost-backend.c +++ b/hw/virtio/vhost-backend.c @@ -42,11 +42,19 @@ static int vhost_kernel_cleanup(struct vhost_dev *dev) return close(fd); } +static int vhost_kernel_get_vq_index(struct vhost_dev *dev, int idx) +{ +assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs); + +return idx - dev->vq_index; +} + static const VhostOps kernel_ops = { .backend_type = VHOST_BACKEND_TYPE_KERNEL, .vhost_call = vhost_kernel_call, .vhost_backend_init = vhost_kernel_init, -.vhost_backend_cleanup = vhost_kernel_cleanup +.vhost_backend_cleanup = vhost_kernel_cleanup, +.vhost_backend_get_vq_index = vhost_kernel_get_vq_index, }; int vhost_set_backend_type(struct vhost_dev *dev, VhostBackendType backend_type) diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 694fde5..5018fd6 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -393,9 +393,17 @@ static int vhost_user_cleanup(struct vhost_dev *dev) return 0; } +static int vhost_user_get_vq_index(struct vhost_dev *dev, int idx) +{ +assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs); + +return idx; +} + const VhostOps user_ops = { .backend_type = VHOST_BACKEND_TYPE_USER, .vhost_call = vhost_user_call, .vhost_backend_init = vhost_user_init, -.vhost_backend_cleanup = vhost_user_cleanup -}; +.vhost_backend_cleanup = vhost_user_cleanup, +.vhost_backend_get_vq_index = 
vhost_user_get_vq_index, +}; diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index a08c36b..7a7812d 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -719,7 +719,7 @@ static int vhost_virtqueue_start(struct vhost_dev *dev, { hwaddr s, l, a; int r; -int vhost_vq_index = idx - dev->vq_index; +int vhost_vq_index = dev->vhost_ops->vhost_backend_get_vq_index(dev, idx); struct vhost_vring_file file = { .index = vhost_vq_index }; @@ -728,7 +728,6 @@ static int vhost_virtqueue_start(struct vhost_dev *dev, }; struct VirtQueue *vvq = virtio_get_queue(vdev, idx); -assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs); vq->num = state.num = virtio_queue_get_num(vdev, idx); r = dev->vhost_ops->vhost_call(dev, VHOST_SET_VRING_NUM, &state); @@ -822,12 +821,12 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev, struct vhost_virtqueue *vq, unsigned idx) { -int vhost_vq_index = idx - dev->vq_index; +int vhost_vq_index = dev->vhost_ops->vhost_backend_get_vq_index(dev, idx); struct vhost_vring_state state = { .index = vhost_vq_index, }; int r; -assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs); + r = dev->vhost_ops->vhost_call(dev, VHOST_GET_VRING_BASE, &state); if (r < 0) { fprintf(stderr, "vhost VQ %d ring restore failed: %d\n", idx, r); @@ -1066,17 +1065,15 @@ void vhost_virtqueue_mask(struct vhost_dev *hdev, VirtIODevice *vdev, int n, { struct VirtQueue *vvq = virtio_get_queue(vdev, n); int r, index = n - hdev->vq_index; +struct vhost_vring_file file; -assert(n >= hdev->vq_index && n < hdev->vq_index + hdev->nvqs); - -struct vhost_vring_file file = { -.index = index -}; if (mask) { file.fd = event_notifier_get_fd(&hdev->vqs[index].masked_notifier); } else { file.fd = event_notifier_get_fd(virtio_queue_get_guest_notifier(vvq)); } + +file.index = hdev->vhost_ops->vhost_backend_get_vq_index(hdev, n); r = hdev->vhost_ops->vhost_call(hdev, VHOST_SET_VRING_CALL, &file); assert(r >= 0); } diff --git a/include/hw/virtio/vhost-backend.h 
b/include/hw/virtio/vhost-backend.h index e472f29..e1dfc6d 100644 --- a/include/hw/virtio/vhost-backend.h +++ b/include/hw/virtio/vhost-backend.h @@ -24,12 +24,14 @@ typedef int (*vhost_call)(struct vhost_dev *dev, unsigned long int request, void *arg); typedef int (*vhost_backend
[Qemu-devel] [PATCH v11 4/7] vhost-user: add VHOST_USER_GET_QUEUE_NUM message
This is for querying how many queues the backend supports if it has mq support (when the VHOST_USER_PROTOCOL_F_MQ flag is set in the queried protocol features). vhost_net_get_max_queues() is the interface to export that value, and to tell whether the backend supports the number of queues the user requested, which is done in the following patch. Signed-off-by: Yuanhan Liu --- v11: define a dummy vhost_net_get_max_queues when !CONFIG_VHOST_NET. --- docs/specs/vhost-user.txt | 11 +++ hw/net/vhost_net.c| 12 hw/virtio/vhost-user.c| 15 ++- include/hw/virtio/vhost.h | 1 + include/net/vhost_net.h | 1 + 5 files changed, 39 insertions(+), 1 deletion(-) diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt index ccbbcbb..43db9b4 100644 --- a/docs/specs/vhost-user.txt +++ b/docs/specs/vhost-user.txt @@ -301,3 +301,14 @@ Message types Bits (0-7) of the payload contain the vring index. Bit 8 is the invalid FD flag. This flag is set when there is no file descriptor in the ancillary data. + + * VHOST_USER_GET_QUEUE_NUM + + Id: 17 + Equivalent ioctl: N/A + Master payload: N/A + Slave payload: u64 + + Query how many queues the backend supports. This request should be + sent only when VHOST_USER_PROTOCOL_F_MQ is set in queried protocol + features by VHOST_USER_GET_PROTOCOL_FEATURES.
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index b7d29b7..f663e5a 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -122,6 +122,11 @@ void vhost_net_ack_features(struct vhost_net *net, uint64_t features) vhost_ack_features(&net->dev, vhost_net_get_feature_bits(net), features); } +uint64_t vhost_net_get_max_queues(VHostNetState *net) +{ +return net->dev.max_queues; +} + static int vhost_net_get_fd(NetClientState *backend) { switch (backend->info->type) { @@ -144,6 +149,8 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options) goto fail; } +net->dev.max_queues = 1; + if (backend_kernel) { r = vhost_net_get_fd(options->net_backend); if (r < 0) { @@ -414,6 +421,11 @@ VHostNetState *get_vhost_net(NetClientState *nc) return vhost_net; } #else +uint64_t vhost_net_get_max_queues(VHostNetState *net) +{ +return 1; +} + struct vhost_net *vhost_net_init(VhostNetOptions *options) { error_report("vhost-net support is not compiled in"); diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 9cb2f52..694fde5 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -25,7 +25,9 @@ #define VHOST_MEMORY_MAX_NREGIONS8 #define VHOST_USER_F_PROTOCOL_FEATURES 30 -#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x0ULL +#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x1ULL + +#define VHOST_USER_PROTOCOL_F_MQ0 typedef enum VhostUserRequest { VHOST_USER_NONE = 0, @@ -45,6 +47,7 @@ typedef enum VhostUserRequest { VHOST_USER_SET_VRING_ERR = 14, VHOST_USER_GET_PROTOCOL_FEATURES = 15, VHOST_USER_SET_PROTOCOL_FEATURES = 16, +VHOST_USER_GET_QUEUE_NUM = 17, VHOST_USER_MAX } VhostUserRequest; @@ -211,6 +214,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, switch (msg_request) { case VHOST_USER_GET_FEATURES: case VHOST_USER_GET_PROTOCOL_FEATURES: +case VHOST_USER_GET_QUEUE_NUM: need_reply = 1; break; @@ -315,6 +319,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, switch (msg_request) { case 
VHOST_USER_GET_FEATURES: case VHOST_USER_GET_PROTOCOL_FEATURES: +case VHOST_USER_GET_QUEUE_NUM: if (msg.size != sizeof(m.u64)) { error_report("Received bad msg size."); return -1; @@ -366,6 +371,14 @@ static int vhost_user_init(struct vhost_dev *dev, void *opaque) if (err < 0) { return err; } + +/* query the max queues we support if backend supports Multiple Queue */ +if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MQ)) { +err = vhost_user_call(dev, VHOST_USER_GET_QUEUE_NUM, &dev->max_queues); +if (err < 0) { +return err; +} +} } return 0; diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h index 6467c73..c3758f3 100644 --- a/include/hw/virtio/vhost.h +++ b/include/hw/virtio/vhost.h @@ -48,6 +48,7 @@ struct vhost_dev { unsigned long long acked_features; unsigned long long backend_features; unsigned long long protocol_features; +unsigned long long max_queues; bool started; bool log_enabled; unsigned long long log_size; diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h index 840d4b1..8db723e 100644 --- a/include/net/vhost_net.h +++ b/include/net/vhost_net.h @@ -13,6 +13,7 @@
[Qemu-devel] [PATCH v11 3/7] vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE
Quote from Michael: We really should rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE. Suggested-by: Michael S. Tsirkin Signed-off-by: Yuanhan Liu --- docs/specs/vhost-user.txt | 4 ++-- hw/net/vhost_net.c | 2 +- hw/virtio/vhost-user.c | 6 +++--- linux-headers/linux/vhost.h | 2 +- tests/vhost-user-test.c | 2 +- 5 files changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt index 70da3b1..ccbbcbb 100644 --- a/docs/specs/vhost-user.txt +++ b/docs/specs/vhost-user.txt @@ -194,10 +194,10 @@ Message types as an owner of the session. This can be used on the Slave as a "session start" flag. - * VHOST_USER_RESET_OWNER + * VHOST_USER_RESET_DEVICE Id: 4 - Equivalent ioctl: VHOST_RESET_OWNER + Equivalent ioctl: VHOST_RESET_DEVICE Master payload: N/A Issued when a new connection is about to be closed. The Master will no diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index 9d32d76..b7d29b7 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -287,7 +287,7 @@ static void vhost_net_stop_one(struct vhost_net *net, } else if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) { for (file.index = 0; file.index < net->dev.nvqs; ++file.index) { const VhostOps *vhost_ops = net->dev.vhost_ops; -int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_OWNER, +int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_DEVICE, NULL); assert(r >= 0); } diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 7fe35c6..9cb2f52 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -32,7 +32,7 @@ typedef enum VhostUserRequest { VHOST_USER_GET_FEATURES = 1, VHOST_USER_SET_FEATURES = 2, VHOST_USER_SET_OWNER = 3, -VHOST_USER_RESET_OWNER = 4, +VHOST_USER_RESET_DEVICE = 4, VHOST_USER_SET_MEM_TABLE = 5, VHOST_USER_SET_LOG_BASE = 6, VHOST_USER_SET_LOG_FD = 7, @@ -98,7 +98,7 @@ static unsigned long int ioctl_to_vhost_user_request[VHOST_USER_MAX] = { VHOST_GET_FEATURES, /* VHOST_USER_GET_FEATURES */ VHOST_SET_FEATURES, 
/* VHOST_USER_SET_FEATURES */ VHOST_SET_OWNER,/* VHOST_USER_SET_OWNER */ -VHOST_RESET_OWNER, /* VHOST_USER_RESET_OWNER */ +VHOST_RESET_DEVICE, /* VHOST_USER_RESET_DEVICE */ VHOST_SET_MEM_TABLE,/* VHOST_USER_SET_MEM_TABLE */ VHOST_SET_LOG_BASE, /* VHOST_USER_SET_LOG_BASE */ VHOST_SET_LOG_FD, /* VHOST_USER_SET_LOG_FD */ @@ -222,7 +222,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, break; case VHOST_USER_SET_OWNER: -case VHOST_USER_RESET_OWNER: +case VHOST_USER_RESET_DEVICE: break; case VHOST_USER_SET_MEM_TABLE: diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h index ead86db..14a0160 100644 --- a/linux-headers/linux/vhost.h +++ b/linux-headers/linux/vhost.h @@ -78,7 +78,7 @@ struct vhost_memory { #define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01) /* Give up ownership, and reset the device to default values. * Allows subsequent call to VHOST_OWNER_SET to succeed. */ -#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02) +#define VHOST_RESET_DEVICE _IO(VHOST_VIRTIO, 0x02) /* Set up/modify memory layout */ #define VHOST_SET_MEM_TABLE_IOW(VHOST_VIRTIO, 0x03, struct vhost_memory) diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c index 75fedf0..e301db7 100644 --- a/tests/vhost-user-test.c +++ b/tests/vhost-user-test.c @@ -58,7 +58,7 @@ typedef enum VhostUserRequest { VHOST_USER_GET_FEATURES = 1, VHOST_USER_SET_FEATURES = 2, VHOST_USER_SET_OWNER = 3, -VHOST_USER_RESET_OWNER = 4, +VHOST_USER_RESET_DEVICE = 4, VHOST_USER_SET_MEM_TABLE = 5, VHOST_USER_SET_LOG_BASE = 6, VHOST_USER_SET_LOG_FD = 7, -- 1.9.0