Re: [PATCH 12/21] vhost-vdpa: introduce uAPI to get the number of virtqueue groups

2020-12-29 Thread Jason Wang


On 2020/12/29 下午8:24, Eli Cohen wrote:

On Wed, Dec 16, 2020 at 02:48:09PM +0800, Jason Wang wrote:

Follows the vDPA support for multiple address spaces, this patch
introduce uAPI for the userspace to know the number of virtqueue
groups supported by the vDPA device.

Can you explain what exactly you mean be userspace?



It's the userspace that uses the uAPI introduced in this patch.



Is it just qemu or
is it destined to the virtio_net driver run by the qemu process?



It could be Qemu, DPDK or other userspace program.

The guest virtio-net driver will not use this but talks to the virtio 
device emulated by Qemu.




Also can you say for what purpose?



This can be used for facilitate the checking of whether the control vq 
could be supported.


E.g if the device support less than 2 groups, qemu won't advertise 
control vq.


Yes, #groups could be inferred from GET_VRING_GROUP. But it's not 
straightforward as this.


Thanks





Signed-off-by: Jason Wang 
---
  drivers/vhost/vdpa.c   | 4 
  include/uapi/linux/vhost.h | 3 +++
  2 files changed, 7 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 060d5b5b7e64..1ba5901b28e7 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -536,6 +536,10 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
case VHOST_VDPA_GET_VRING_NUM:
r = vhost_vdpa_get_vring_num(v, argp);
break;
+   case VHOST_VDPA_GET_GROUP_NUM:
+   r = copy_to_user(argp, &v->vdpa->ngroups,
+sizeof(v->vdpa->ngroups));
+   break;
case VHOST_SET_LOG_BASE:
case VHOST_SET_LOG_FD:
r = -ENOIOCTLCMD;
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index 59c6c0fbaba1..8a4e6e426bbf 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -145,4 +145,7 @@
  /* Get the valid iova range */
  #define VHOST_VDPA_GET_IOVA_RANGE _IOR(VHOST_VIRTIO, 0x78, \
 struct vhost_vdpa_iova_range)
+/* Get the number of virtqueue groups. */
+#define VHOST_VDPA_GET_GROUP_NUM   _IOR(VHOST_VIRTIO, 0x79, unsigned int)
+
  #endif
--
2.25.1



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 11/21] vhost-vdpa: introduce asid based IOTLB

2020-12-29 Thread Jason Wang


On 2020/12/29 下午7:53, Eli Cohen wrote:

+
+static struct vhost_vdpa_as *vhost_vdpa_alloc_as(struct vhost_vdpa *v, u32 
asid)
+{
+   struct hlist_head *head = &v->as[asid % VHOST_VDPA_IOTLB_BUCKETS];
+   struct vhost_vdpa_as *as;
+
+   if (asid_to_as(v, asid))
+   return NULL;
+
+   as = kmalloc(sizeof(*as), GFP_KERNEL);

kzalloc()? See comment below.


+   if (!as)
+   return NULL;
+
+   vhost_iotlb_init(&as->iotlb, 0, 0);
+   as->id = asid;
+   hlist_add_head(&as->hash_link, head);
+   ++v->used_as;

Although you eventually ended up removing used_as, this is a bug since
you're incrementing a random value. Maybe it's better to be on the safe
side and use kzalloc() for as above.



Yes and used_as needs to be removed.

Thanks







___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 11/21] vhost-vdpa: introduce asid based IOTLB

2020-12-29 Thread Jason Wang


On 2020/12/29 下午8:05, Eli Cohen wrote:

+
+static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)

The return value is never interpreted. I think it should either be made
void or return values checked.



Right, will make it void.





+{
+   struct vhost_vdpa_as *as = asid_to_as(v, asid);
+
+   /* Remove default address space is not allowed */
+   if (asid == 0)
+   return -EINVAL;

Can you explain why? I think you have a memory leak due to this as no
one will ever free as with id 0.



Looks like a bug. Will remove this.

Thanks


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 11/21] vhost-vdpa: introduce asid based IOTLB

2020-12-29 Thread Jason Wang


On 2020/12/29 下午7:41, Eli Cohen wrote:

On Wed, Dec 16, 2020 at 02:48:08PM +0800, Jason Wang wrote:

This patch converts the vhost-vDPA device to support multiple IOTLBs
tagged via ASID via hlist. This will be used for supporting multiple
address spaces in the following patches.

Signed-off-by: Jason Wang 
---
  drivers/vhost/vdpa.c | 106 ---
  1 file changed, 80 insertions(+), 26 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index feb6a58df22d..060d5b5b7e64 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -33,13 +33,21 @@ enum {
  
  #define VHOST_VDPA_DEV_MAX (1U << MINORBITS)
  
+#define VHOST_VDPA_IOTLB_BUCKETS 16

+
+struct vhost_vdpa_as {
+   struct hlist_node hash_link;
+   struct vhost_iotlb iotlb;
+   u32 id;
+};
+
  struct vhost_vdpa {
struct vhost_dev vdev;
struct iommu_domain *domain;
struct vhost_virtqueue *vqs;
struct completion completion;
struct vdpa_device *vdpa;
-   struct vhost_iotlb *iotlb;
+   struct hlist_head as[VHOST_VDPA_IOTLB_BUCKETS];
struct device dev;
struct cdev cdev;
atomic_t opened;
@@ -49,12 +57,64 @@ struct vhost_vdpa {
struct eventfd_ctx *config_ctx;
int in_batch;
struct vdpa_iova_range range;
+   int used_as;

This is not really used. Not in this patch and later removed.



Right, will remove this.

Thanks


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC v2 09/13] vduse: Add support for processing vhost iotlb message

2020-12-29 Thread Jason Wang


On 2020/12/29 下午6:26, Yongji Xie wrote:

On Tue, Dec 29, 2020 at 5:11 PM Jason Wang  wrote:



- Original Message -

On Mon, Dec 28, 2020 at 4:43 PM Jason Wang  wrote:


On 2020/12/28 下午4:14, Yongji Xie wrote:

I see. So all the above two questions are because VHOST_IOTLB_INVALIDATE
is expected to be synchronous. This need to be solved by tweaking the
current VDUSE API or we can re-visit to go with descriptors relaying
first.


Actually all vdpa related operations are synchronous in current
implementation. The ops.set_map/dma_map/dma_unmap should not return
until the VDUSE_UPDATE_IOTLB/VDUSE_INVALIDATE_IOTLB message is replied
by userspace. Could it solve this problem?


   I was thinking whether or not we need to generate IOTLB_INVALIDATE
message to VDUSE during dma_unmap (vduse_dev_unmap_page).

If we don't, we're probably fine.


It seems not feasible. This message will be also used in the
virtio-vdpa case to notify userspace to unmap some pages during
consistent dma unmapping. Maybe we can document it to make sure the
users can handle the message correctly.

Just to make sure I understand your point.

Do you mean you plan to notify the unmap of 1) streaming DMA or 2)
coherent DMA?

For 1) you probably need a workqueue to do that since dma unmap can
be done in irq or bh context. And if usrspace does't do the unmap, it
can still access the bounce buffer (if you don't zap pte)?


I plan to do it in the coherent DMA case.



Any reason for treating coherent DMA differently?



It's true that userspace can
access the dma buffer if userspace doesn't do the unmap. But the dma
pages would not be freed and reused unless user space called munmap()
for them.



I wonder whether or not we could recycle IOVA in this case to avoid the 
IOTLB_UMAP message.


Thanks




Thanks,
Yongji



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 10/21] vhost: support ASID in IOTLB API

2020-12-29 Thread Jason Wang


On 2020/12/29 下午6:20, Eli Cohen wrote:

-static int vhost_process_iotlb_msg(struct vhost_dev *dev,
+static int vhost_process_iotlb_msg(struct vhost_dev *dev, u16 asid,
   struct vhost_iotlb_msg *msg)
  {
int ret = 0;
  
+	if (asid != 0)

+   return -EINVAL;
+
mutex_lock(&dev->mutex);
vhost_dev_lock_vqs(dev);
switch (msg->type) {
@@ -1135,6 +1138,7 @@ ssize_t vhost_chr_write_iter(struct vhost_dev *dev,
struct vhost_iotlb_msg msg;
size_t offset;
int type, ret;
+   u16 asid = 0;

You assume asid occupies just 16 bits. So maybe you should reserve the
other 16 bits for future extension:

struct vhost_msg_v2 {
 __u32 type;
-   __u32 reserved;
+   __u16 asid;
+   __u16 reserved;
 union {

Moreover, maybe this should be reflected in previous patches that use
the asid:

-static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb 
*iotlb)
+static int mlx5_vdpa_set_map(struct vdpa_device *vdev, u16 asid,
+struct vhost_iotlb *iotlb)

-static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
+static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, u16 asid,
 struct vhost_iotlb_msg *msg)

etc.



Good catch.

This is a bug of the code actually. Since I want to stick to 32bit to be 
large enough for e.g PASID.


Will fix.

Thanks






___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 07/21] vdpa: multiple address spaces support

2020-12-29 Thread Jason Wang


On 2020/12/29 下午3:28, Eli Cohen wrote:

@@ -43,6 +43,8 @@ struct vdpa_vq_state {
   * @index: device index
   * @features_valid: were features initialized? for legacy guests
   * @nvqs: the number of virtqueues
+ * @ngroups: the number of virtqueue groups
+ * @nas: the number of address spaces

I am not sure these can be categorised as part of the state of the VQ.
It's more of a property so maybe we can have a callback to get the
properties of the VQ?



Or maybe there's a misunderstanding of the patch.

Those two attributes belongs to vdpa_device instead of vdpa_vq_state 
actually.


Thanks

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 07/21] vdpa: multiple address spaces support

2020-12-29 Thread Jason Wang


On 2020/12/29 下午3:28, Eli Cohen wrote:

@@ -43,6 +43,8 @@ struct vdpa_vq_state {
   * @index: device index
   * @features_valid: were features initialized? for legacy guests
   * @nvqs: the number of virtqueues
+ * @ngroups: the number of virtqueue groups
+ * @nas: the number of address spaces

I am not sure these can be categorised as part of the state of the VQ.
It's more of a property so maybe we can have a callback to get the
properties of the VQ?

Moreover, I think they should be handled in the hardware drivers to
return 1 for both ngroups and nas.



We can, but those are static attributes that is not expected to be 
changed by the driver.


Introduce callbacks for those static stuffs does not give us any advantage.

For group id and notification area, since we don't have a abstract of 
vdpa_virtqueue. The only choice is to introduce callbacks for them.


Maybe it's time to introduce vdpa_virtqueue.

Thanks

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH rfc 2/3] virtio-net: support receive timestamp

2020-12-29 Thread Willem de Bruijn
On Tue, Dec 29, 2020 at 4:23 AM Jason Wang  wrote:
>
>
>
> - Original Message -
> > On Mon, Dec 28, 2020 at 7:55 PM Michael S. Tsirkin  wrote:
> > >
> > > On Mon, Dec 28, 2020 at 02:30:31PM -0500, Willem de Bruijn wrote:
> > > > On Mon, Dec 28, 2020 at 12:29 PM Michael S. Tsirkin 
> > > > wrote:
> > > > >
> > > > > On Mon, Dec 28, 2020 at 11:22:32AM -0500, Willem de Bruijn wrote:
> > > > > > From: Willem de Bruijn 
> > > > > >
> > > > > > Add optional PTP hardware timestamp offload for virtio-net.
> > > > > >
> > > > > > Accurate RTT measurement requires timestamps close to the wire.
> > > > > > Introduce virtio feature VIRTIO_NET_F_RX_TSTAMP. If negotiated, the
> > > > > > virtio-net header is expanded with room for a timestamp. A host may
> > > > > > pass receive timestamps for all or some packets. A timestamp is 
> > > > > > valid
> > > > > > if non-zero.
> > > > > >
> > > > > > The timestamp straddles (virtual) hardware domains. Like PTP, use
> > > > > > international atomic time (CLOCK_TAI) as global clock base. It is
> > > > > > guest responsibility to sync with host, e.g., through kvm-clock.
> > > > > >
> > > > > > Signed-off-by: Willem de Bruijn 
> > > > > > diff --git a/include/uapi/linux/virtio_net.h
> > > > > > b/include/uapi/linux/virtio_net.h
> > > > > > index f6881b5b77ee..0ffe2eeebd4a 100644
> > > > > > --- a/include/uapi/linux/virtio_net.h
> > > > > > +++ b/include/uapi/linux/virtio_net.h
> > > > > > @@ -57,6 +57,7 @@
> > > > > >* Steering */
> > > > > >  #define VIRTIO_NET_F_CTRL_MAC_ADDR 23/* Set MAC address */
> > > > > >
> > > > > > +#define VIRTIO_NET_F_RX_TSTAMP 55/* Host sends TAI
> > > > > > receive time */
> > > > > >  #define VIRTIO_NET_F_TX_HASH   56/* Guest sends hash report */
> > > > > >  #define VIRTIO_NET_F_HASH_REPORT  57 /* Supports hash report */
> > > > > >  #define VIRTIO_NET_F_RSS   60/* Supports RSS RX steering */
> > > > > > @@ -182,6 +183,17 @@ struct virtio_net_hdr_v1_hash {
> > > > > >   };
> > > > > >  };
> > > > > >
> > > > > > +struct virtio_net_hdr_v12 {
> > > > > > + struct virtio_net_hdr_v1 hdr;
> > > > > > + struct {
> > > > > > + __le32 value;
> > > > > > + __le16 report;
> > > > > > + __le16 flow_state;
> > > > > > + } hash;
> > > > > > + __virtio32 reserved;
> > > > > > + __virtio64 tstamp;
> > > > > > +};
> > > > > > +
> > > > > >  #ifndef VIRTIO_NET_NO_LEGACY
> > > > > >  /* This header comes first in the scatter-gather list.
> > > > > >   * For legacy virtio, if VIRTIO_F_ANY_LAYOUT is not negotiated, it
> > > > > >   must
> > > > >
> > > > >
> > > > > So it looks like VIRTIO_NET_F_RX_TSTAMP should depend on both
> > > > > VIRTIO_NET_F_RX_TSTAMP and VIRTIO_NET_F_HASH_REPORT then?
> > > >
> > > > Do you mean VIRTIO_NET_F_TX_TSTAMP depends on VIRTIO_NET_F_RX_TSTAMP?
> > > >
> > > > I think if either is enabled we need to enable the extended layout.
> > > > Regardless of whether TX_HASH or HASH_REPORT are enabled. If they are
> > > > not, then those fields are ignored.
> > >
> > > I mean do we waste the 8 bytes for hash if TSTAMP is set but HASH is not?
> > > If TSTAMP depends on HASH then point is moot.
> >
> > True, but the two features really are independent.
> >
> > I did consider using configurable metadata layout depending on
> > features negotiated. If there are tons of optional extensions, that
> > makes sense. But it is more complex and parsing error prone. With a
> > handful of options each of a few bytes, that did not seem worth the
> > cost to me at this point.
>
> Consider NFV workloads(64B) packet. Most fields of current vnet header
> is even a burdern.

Such workloads will not negotiate these features, I imagine.

I think this could only become an issue if a small packet workload
would want to negotiate one optional feature, without the others.

Even then, the actual cost of a few untouched bytes is negligible, as
long as the headers don't span cache-lines. I expect it to be cheaper
than having to parse a dynamic layout.

> It might be the time to introduce configurable or
> self-descriptive vnet header.

I did briefly toy with a type-val encoding. Not type-val-len, as all
options have fixed length (so far), so length can be left implicit.
But as each feature is negotiated before hand, the type can be left
implicit too. Leaving just the data elements tightly packed. As long
as order is defined.

> > And importantly, such a mode can always be added later as a separate
> > VIRTIO_NET_F_PACKED_HEADER option.
>
> What will do feature provide?

The tightly packed data elements. Strip out the elements for non
negotiated features. So if either tstamp option is negotiated, but
hash is not, exclude the 64b of hash fields. Note that I'm not
actually arguing for this (at this point).
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.

Re: [PATCH rfc 2/3] virtio-net: support receive timestamp

2020-12-29 Thread Jason Wang



- Original Message -
> On Mon, Dec 28, 2020 at 7:55 PM Michael S. Tsirkin  wrote:
> >
> > On Mon, Dec 28, 2020 at 02:30:31PM -0500, Willem de Bruijn wrote:
> > > On Mon, Dec 28, 2020 at 12:29 PM Michael S. Tsirkin 
> > > wrote:
> > > >
> > > > On Mon, Dec 28, 2020 at 11:22:32AM -0500, Willem de Bruijn wrote:
> > > > > From: Willem de Bruijn 
> > > > >
> > > > > Add optional PTP hardware timestamp offload for virtio-net.
> > > > >
> > > > > Accurate RTT measurement requires timestamps close to the wire.
> > > > > Introduce virtio feature VIRTIO_NET_F_RX_TSTAMP. If negotiated, the
> > > > > virtio-net header is expanded with room for a timestamp. A host may
> > > > > pass receive timestamps for all or some packets. A timestamp is valid
> > > > > if non-zero.
> > > > >
> > > > > The timestamp straddles (virtual) hardware domains. Like PTP, use
> > > > > international atomic time (CLOCK_TAI) as global clock base. It is
> > > > > guest responsibility to sync with host, e.g., through kvm-clock.
> > > > >
> > > > > Signed-off-by: Willem de Bruijn 
> > > > > diff --git a/include/uapi/linux/virtio_net.h
> > > > > b/include/uapi/linux/virtio_net.h
> > > > > index f6881b5b77ee..0ffe2eeebd4a 100644
> > > > > --- a/include/uapi/linux/virtio_net.h
> > > > > +++ b/include/uapi/linux/virtio_net.h
> > > > > @@ -57,6 +57,7 @@
> > > > >* Steering */
> > > > >  #define VIRTIO_NET_F_CTRL_MAC_ADDR 23/* Set MAC address */
> > > > >
> > > > > +#define VIRTIO_NET_F_RX_TSTAMP 55/* Host sends TAI
> > > > > receive time */
> > > > >  #define VIRTIO_NET_F_TX_HASH   56/* Guest sends hash report */
> > > > >  #define VIRTIO_NET_F_HASH_REPORT  57 /* Supports hash report */
> > > > >  #define VIRTIO_NET_F_RSS   60/* Supports RSS RX steering */
> > > > > @@ -182,6 +183,17 @@ struct virtio_net_hdr_v1_hash {
> > > > >   };
> > > > >  };
> > > > >
> > > > > +struct virtio_net_hdr_v12 {
> > > > > + struct virtio_net_hdr_v1 hdr;
> > > > > + struct {
> > > > > + __le32 value;
> > > > > + __le16 report;
> > > > > + __le16 flow_state;
> > > > > + } hash;
> > > > > + __virtio32 reserved;
> > > > > + __virtio64 tstamp;
> > > > > +};
> > > > > +
> > > > >  #ifndef VIRTIO_NET_NO_LEGACY
> > > > >  /* This header comes first in the scatter-gather list.
> > > > >   * For legacy virtio, if VIRTIO_F_ANY_LAYOUT is not negotiated, it
> > > > >   must
> > > >
> > > >
> > > > So it looks like VIRTIO_NET_F_RX_TSTAMP should depend on both
> > > > VIRTIO_NET_F_RX_TSTAMP and VIRTIO_NET_F_HASH_REPORT then?
> > >
> > > Do you mean VIRTIO_NET_F_TX_TSTAMP depends on VIRTIO_NET_F_RX_TSTAMP?
> > >
> > > I think if either is enabled we need to enable the extended layout.
> > > Regardless of whether TX_HASH or HASH_REPORT are enabled. If they are
> > > not, then those fields are ignored.
> >
> > I mean do we waste the 8 bytes for hash if TSTAMP is set but HASH is not?
> > If TSTAMP depends on HASH then point is moot.
> 
> True, but the two features really are independent.
> 
> I did consider using configurable metadata layout depending on
> features negotiated. If there are tons of optional extensions, that
> makes sense. But it is more complex and parsing error prone. With a
> handful of options each of a few bytes, that did not seem worth the
> cost to me at this point.

Consider NFV workloads(64B) packet. Most fields of current vnet header
is even a burdern. It might be the time to introduce configurable or
self-descriptive vnet header.


> 
> And importantly, such a mode can always be added later as a separate
> VIRTIO_NET_F_PACKED_HEADER option.

What will do feature provide?

Thanks

> 
> If anything, perhaps if we increase the virtio_net_hdr_* allocation,
> we should allocate some additional reserved space now? As each new
> size introduces quite a bit of boilerplate. Also, e.g., in qemu just
> to pass the sizes between virtio-net driver and vhost-net device.
> 
> 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC v2 09/13] vduse: Add support for processing vhost iotlb message

2020-12-29 Thread Jason Wang


- Original Message -
> On Mon, Dec 28, 2020 at 4:43 PM Jason Wang  wrote:
> >
> >
> > On 2020/12/28 下午4:14, Yongji Xie wrote:
> > >> I see. So all the above two questions are because VHOST_IOTLB_INVALIDATE
> > >> is expected to be synchronous. This need to be solved by tweaking the
> > >> current VDUSE API or we can re-visit to go with descriptors relaying
> > >> first.
> > >>
> > > Actually all vdpa related operations are synchronous in current
> > > implementation. The ops.set_map/dma_map/dma_unmap should not return
> > > until the VDUSE_UPDATE_IOTLB/VDUSE_INVALIDATE_IOTLB message is replied
> > > by userspace. Could it solve this problem?
> >
> >
> >   I was thinking whether or not we need to generate IOTLB_INVALIDATE
> > message to VDUSE during dma_unmap (vduse_dev_unmap_page).
> >
> > If we don't, we're probably fine.
> >
> 
> It seems not feasible. This message will be also used in the
> virtio-vdpa case to notify userspace to unmap some pages during
> consistent dma unmapping. Maybe we can document it to make sure the
> users can handle the message correctly.

Just to make sure I understand your point.

Do you mean you plan to notify the unmap of 1) streaming DMA or 2)
coherent DMA?

For 1) you probably need a workqueue to do that since dma unmap can
be done in irq or bh context. And if usrspace does't do the unmap, it
can still access the bounce buffer (if you don't zap pte)?

Thanks


> 
> Thanks,
> Yongji
> 
> 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] virtio-mem: use false for bool variable

2020-12-29 Thread David Hildenbrand
On 29.12.20 03:44, Tian Tao wrote:
> Fixes coccicheck warnings:
> drivers/virtio/virtio_mem.c:2580:2-25: WARNING: Assignment of 0/1 to
> bool variable.
> 
> Signed-off-by: Tian Tao 
> ---
>  drivers/virtio/virtio_mem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
> index 9fc9ec4..85a272c 100644
> --- a/drivers/virtio/virtio_mem.c
> +++ b/drivers/virtio/virtio_mem.c
> @@ -2577,7 +2577,7 @@ static int virtio_mem_probe(struct virtio_device *vdev)
>* actually in use (e.g., trying to reload the driver).
>*/
>   if (vm->plugged_size) {
> - vm->unplug_all_required = 1;
> + vm->unplug_all_required = true;
>   dev_info(&vm->vdev->dev, "unplugging all memory is required\n");
>   }
>  
> 

Thanks - the patch subject is a little weird ("false"). I suggest

"virtio-mem: use boolean value when setting vm->unplug_all_required"

Apart from that

Acked-by: David Hildenbrand 


@Mst, do you want a resend or can you fixup the subject? Thanks!

-- 
Thanks,

David / dhildenb

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 0/6] KVM: arm64: VCPU preempted check support

2020-12-29 Thread yezengruan
On 2020/1/15 22:14, Marc Zyngier wrote:
> On 2020-01-13 12:12, Will Deacon wrote:
>> [+PeterZ]
>>
>> On Thu, Dec 26, 2019 at 09:58:27PM +0800, Zengruan Ye wrote:
>>> This patch set aims to support the vcpu_is_preempted() functionality
>>> under KVM/arm64, which allowing the guest to obtain the VCPU is
>>> currently running or not. This will enhance lock performance on
>>> overcommitted hosts (more runnable VCPUs than physical CPUs in the
>>> system) as doing busy waits for preempted VCPUs will hurt system
>>> performance far worse than early yielding.
>>>
>>> We have observed some performace improvements in uninx benchmark tests.
>>>
>>> unix benchmark result:
>>>   host:  kernel 5.5.0-rc1, HiSilicon Kunpeng920, 8 CPUs
>>>   guest: kernel 5.5.0-rc1, 16 VCPUs
>>>
>>>    test-case    |    after-patch    |   before-patch
>>> +---+--
>>>  Dhrystone 2 using register variables   | 334600751.0 lps   | 335319028.3 
>>> lps
>>>  Double-Precision Whetstone | 32856.1 MWIPS | 32849.6 
>>> MWIPS
>>>  Execl Throughput   |  3662.1 lps   |  2718.0 
>>> lps
>>>  File Copy 1024 bufsize 2000 maxblocks  |    432906.4 KBps  |    158011.8 
>>> KBps
>>>  File Copy 256 bufsize 500 maxblocks    |    116023.0 KBps  | 37664.0 
>>> KBps
>>>  File Copy 4096 bufsize 8000 maxblocks  |   1432769.8 KBps  |    441108.8 
>>> KBps
>>>  Pipe Throughput    |   6405029.6 lps   |   6021457.6 
>>> lps
>>>  Pipe-based Context Switching   |    185872.7 lps   |    184255.3 
>>> lps
>>>  Process Creation   |  4025.7 lps   |  3706.6 
>>> lps
>>>  Shell Scripts (1 concurrent)   |  6745.6 lpm   |  6436.1 
>>> lpm
>>>  Shell Scripts (8 concurrent)   |   998.7 lpm   |   931.1 
>>> lpm
>>>  System Call Overhead   |   3913363.1 lps   |   3883287.8 
>>> lps
>>> +---+--
>>>  System Benchmarks Index Score  |  1835.1   |  1327.6
>>
>> Interesting, thanks for the numbers.
>>
>> So it looks like there is a decent improvement to be had from targetted vCPU
>> wakeup, but I really dislike the explicit PV interface and it's already been
>> shown to interact badly with the WFE-based polling in smp_cond_load_*().
>>
>> Rather than expose a divergent interface, I would instead like to explore an
>> improvement to smp_cond_load_*() and see how that performs before we commit
>> to something more intrusive. Marc and I looked at this very briefly in the
>> past, and the basic idea is to register all of the WFE sites with the
>> hypervisor, indicating which register contains the address being spun on
>> and which register contains the "bad" value. That way, you don't bother
>> rescheduling a vCPU if the value at the address is still bad, because you
>> know it will exit immediately.
>>
>> Of course, the devil is in the details because when I say "address", that's
>> a guest virtual address, so you need to play some tricks in the hypervisor
>> so that you have a separate mapping for the lockword (it's enough to keep
>> track of the physical address).
>>
>> Our hacks are here but we basically ran out of time to work on them beyond
>> an unoptimised and hacky prototype:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/pvcy
>>
>> Marc -- how would you prefer to handle this?
> 
> Let me try and rebase this thing to a modern kernel (I doubt it applies 
> without
> conflicts to mainline). We can then have discussion about its merit on the 
> list
> once I post it. It'd be good to have a pointer to the benchamrks that have 
> been
> used here.
> 
> Thanks,
> 
>     M.


Hi Marc, Will,

My apologies for the slow reply. Just checking what is the latest on this
PV cond yield prototype?

https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/pvcy

The following are the unixbench test results of PV cond yield prototype:

unix benchmark result:
  host:  kernel 5.10.0-rc6, HiSilicon Kunpeng920, 8 CPUs
  guest: kernel 5.10.0-rc6, 16 VCPUs
   | 5.10.0-rc6 | pv_cond_yield | 
vcpu_is_preempted
 System Benchmarks Index Values|INDEX   |  INDEX|  INDEX
---++---+---
 Dhrystone 2 using register variables  |  29164.0   |29156.9|29207.2
 Double-Precision Whetstone|   6807.6   | 6789.2| 6912.1
 Execl Throughput  |856.7   | 1195.6|  863.1
 File Copy 1024 bufsize 2000 maxblocks |189.9   |  923.5| 1094.2
 File Copy 256 bufsize 500 maxblocks   |121.9   |  578.4|  588.7
 File Copy 4096 bufsize 8000 maxblocks |419.9   | 1992.0| 2733.7
 Pipe