Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue

2023-05-09 Thread Michael S. Tsirkin
On Mon, Apr 17, 2023 at 11:40:58AM +0800, Jason Wang wrote:
> On Fri, Apr 14, 2023 at 3:21 PM Michael S. Tsirkin  wrote:
> >
> > On Fri, Apr 14, 2023 at 01:04:15PM +0800, Jason Wang wrote:
> > > Forgot to cc netdev, adding.
> > >
> > > On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > > On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
> > > > > This patch converts rx mode setting to be done in a workqueue. This is
> > > > > required to allow sleeping while waiting for the cvq command
> > > > > response, since the current code runs under the addr spin lock.
> > > > >
> > > > > Signed-off-by: Jason Wang 
> > > >
> > > > I don't like this frankly. This means that setting RX mode which would
> > > > previously be reliable, now becomes unreliable.
> > >
> > > It is "unreliable" by design:
> > >
> > >   void(*ndo_set_rx_mode)(struct net_device *dev);
> > >
> > > > - first of all configuration is no longer immediate
> > >
> > > Is immediate a hard requirement? I can see a workqueue is used at least:
> > >
> > > mlx5e, ipoib, efx, ...
> > >
> > > >   and there is no way for driver to find out when
> > > >   it actually took effect
> > >
> > > But we know rx mode is best effort, e.g. it doesn't support vhost, and
> > > we have survived this for years.
> > >
> > > > - second, if device fails command, this is also not
> > > >   propagated to driver, again no way for driver to find out
> > > >
> > > > VDUSE needs to be fixed to do tricks to fix this
> > > > without breaking normal drivers.
> > >
> > > It's not specific to VDUSE. For example, when using virtio-net in the
> > > UP environment with any software cvq (like mlx5 via vDPA or cma
> > > transport).
> > >
> > > Thanks
> >
> > Hmm. Can we differentiate between these use-cases?
> 
> It doesn't look easy since we are a driver for the virtio bus. The
> underlying transport details are hidden from virtio-net.
> 
> Or do you have any ideas on this?
> 
> Thanks

I don't know, pass some kind of flag in struct virtqueue?
"bool slow; /* This vq can be very slow sometimes. Don't wait for it! 
*/"

?

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH net v3] virtio_net: Fix error unwinding of XDP initialization

2023-05-09 Thread Jason Wang


On 2023/5/9 09:43, Xuan Zhuo wrote:

On Mon, 8 May 2023 11:00:10 -0400, Feng Liu  wrote:


On 2023-05-07 p.m.9:45, Xuan Zhuo wrote:



On Sat, 6 May 2023 08:08:02 -0400, Feng Liu  wrote:


On 2023-05-05 p.m.10:33, Xuan Zhuo wrote:



On Tue, 2 May 2023 20:35:25 -0400, Feng Liu  wrote:

When initializing XDP in virtnet_open(), some rq xdp initialization
may hit an error causing net device open failed. However, previous
rqs have already initialized XDP and enabled NAPI, which is not the
expected behavior. Need to roll back the previous rq initialization
to avoid leaks in error unwinding of init code.

Also extract a helper function to disable a queue pair, and use the
newly introduced helper in error unwinding and virtnet_close;

Issue: 3383038
Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info")
Signed-off-by: Feng Liu 
Reviewed-by: William Tu 
Reviewed-by: Parav Pandit 
Reviewed-by: Simon Horman 
Acked-by: Michael S. Tsirkin 
Change-Id: Ib4c6a97cb7b837cfa484c593dd43a435c47ea68f
---
drivers/net/virtio_net.c | 30 --
1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 8d8038538fc4..3737cf120cb7 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1868,6 +1868,13 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
 return received;
}

+static void virtnet_disable_qp(struct virtnet_info *vi, int qp_index)
+{
+ virtnet_napi_tx_disable(&vi->sq[qp_index].napi);
+ napi_disable(&vi->rq[qp_index].napi);
+ xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq);
+}
+
static int virtnet_open(struct net_device *dev)
{
 struct virtnet_info *vi = netdev_priv(dev);
@@ -1883,20 +1890,26 @@ static int virtnet_open(struct net_device *dev)

 err = xdp_rxq_info_reg(&vi->rq[i].xdp_rxq, dev, i, 
vi->rq[i].napi.napi_id);
 if (err < 0)
- return err;
+ goto err_xdp_info_reg;

 err = xdp_rxq_info_reg_mem_model(&vi->rq[i].xdp_rxq,
  MEM_TYPE_PAGE_SHARED, NULL);
- if (err < 0) {
- xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
- return err;
- }
+ if (err < 0)
+ goto err_xdp_reg_mem_model;

 virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
 virtnet_napi_tx_enable(vi, vi->sq[i].vq, &vi->sq[i].napi);
 }

 return 0;
+
+err_xdp_reg_mem_model:
+ xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
+err_xdp_info_reg:
+ for (i = i - 1; i >= 0; i--)
+ virtnet_disable_qp(vi, i);


I would like to know whether we should handle these:

   disable_delayed_refill(vi);
   cancel_delayed_work_sync(&vi->refill);


Maybe we should call virtnet_close() with "i" directly.

Thanks.



We can't use i directly here: if xdp_rxq_info_reg fails, napi has not
been enabled for the current qp yet, so I should roll back only the
queue pairs where napi was enabled before (i--), otherwise it will hang
in the napi disable API.

This is not the point; the key is whether we should handle:

disable_delayed_refill(vi);
cancel_delayed_work_sync(&vi->refill);

Thanks.



OK, I get the point. Thanks for your careful review. I checked the code
again.

There are two points that I need to explain:

1. All refill delayed work calls (vi->refill, vi->refill_enabled) assume
that the virtio interface has been opened successfully; see
virtnet_receive, virtnet_rx_resize, _virtnet_set_queues, etc. If the
xdp reg fails here, these subsequent functions will never run, so there
is no need to call disable_delayed_refill() and
cancel_delayed_work_sync().

Maybe something is wrong. I think these lines may schedule the delayed work.

static int virtnet_open(struct net_device *dev)
{
struct virtnet_info *vi = netdev_priv(dev);
int i, err;

enable_delayed_refill(vi);

for (i = 0; i < vi->max_queue_pairs; i++) {
if (i < vi->curr_queue_pairs)
/* Make sure we have some buffers: if oom use wq. */
-->  if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
-->  schedule_delayed_work(&vi->refill, 0);

err = xdp_rxq_info_reg(&vi->rq[i].xdp_rxq, dev, i, 
vi->rq[i].napi.napi_id);
if (err < 0)
return err;

err = xdp_rxq_info_reg_mem_model(&vi->rq[i].xdp_rxq,
 MEM_TYPE_PAGE_SHARED, NULL);
if (err < 0) {
xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
return err;
}

virtnet_napi_enable(vi->rq[i].vq, &vi-

Re: [PATCH] virtio_ring: use u32 for virtio_max_dma_size

2023-05-09 Thread Michael S. Tsirkin
On Wed, May 10, 2023 at 12:04:50PM +0800, Jason Wang wrote:
> On Wed, May 10, 2023 at 11:44 AM Michael S. Tsirkin  wrote:
> >
> > On Wed, May 10, 2023 at 11:26:54AM +0800, Xuan Zhuo wrote:
> > > On Wed, 10 May 2023 10:54:37 +0800, zhenwei pi  
> > > wrote:
> > > > Both split ring and packed ring use 32bits to describe the length of
> > > > a descriptor: see struct vring_desc and struct vring_packed_desc.
> > > > This means the max segment size supported by virtio is U32_MAX.
> > > >
> > > > An example of virtio_max_dma_size in virtio_blk.c:
> > > >   u32 v, max_size;
> > > >
> > > >   max_size = virtio_max_dma_size(vdev);  -> implicit convert
> > > >   err = virtio_cread_feature(vdev, VIRTIO_BLK_F_SIZE_MAX,
> > > >  struct virtio_blk_config, size_max, &v);
> > > >   max_size = min(max_size, v);
> > > >
> > > > There is a risk during implicit convert here, once virtio_max_dma_size
> > > > returns 4G, max_size becomes 0.
> > > >
> > > > Fixes: e6d6dd6c875e ("virtio: Introduce virtio_max_dma_size()")
> > > > Cc: Joerg Roedel 
> > > > Signed-off-by: zhenwei pi 
> > > > ---
> > > >  drivers/virtio/virtio_ring.c | 12 
> > > >  include/linux/virtio.h   |  2 +-
> > > >  2 files changed, 9 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > index c5310eaf8b46..55cfecf030a1 100644
> > > > --- a/drivers/virtio/virtio_ring.c
> > > > +++ b/drivers/virtio/virtio_ring.c
> > > > @@ -289,12 +289,16 @@ static bool vring_use_dma_api(const struct 
> > > > virtio_device *vdev)
> > > > return false;
> > > >  }
> > > >
> > > > -size_t virtio_max_dma_size(const struct virtio_device *vdev)
> > > > +u32 virtio_max_dma_size(const struct virtio_device *vdev)
> > >
> > >
> > > LGTM
> > >
> > > But, should we change the parameter to vq, then use the dma_dev?
> > >
> > > @Jason
> > >
> > > Thanks.
> > >
> >
> >
> > that would be an unrelated rework.
> 
> Probably, but I think it's better to be done on top otherwise we may forget.
> 
> Thanks

Just to make things clear: I'm merging fixes for this
release, but cleanups belong in the next one.

> >
> > > >  {
> > > > -   size_t max_segment_size = SIZE_MAX;
> > > > +   u32 max_segment_size = U32_MAX;
> > > >
> > > > -   if (vring_use_dma_api(vdev))
> > > > -   max_segment_size = dma_max_mapping_size(vdev->dev.parent);
> > > > +   if (vring_use_dma_api(vdev)) {
> > > > +   size_t max_dma_size = 
> > > > dma_max_mapping_size(vdev->dev.parent);
> > > > +
> > > > +   if (max_dma_size < max_segment_size)
> > > > +   max_segment_size = max_dma_size;
> > > > +   }
> > > >
> > > > return max_segment_size;
> > > >  }
> > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > index b93238db94e3..1a605f408329 100644
> > > > --- a/include/linux/virtio.h
> > > > +++ b/include/linux/virtio.h
> > > > @@ -147,7 +147,7 @@ int virtio_device_restore(struct virtio_device 
> > > > *dev);
> > > >  #endif
> > > >  void virtio_reset_device(struct virtio_device *dev);
> > > >
> > > > -size_t virtio_max_dma_size(const struct virtio_device *vdev);
> > > > +u32 virtio_max_dma_size(const struct virtio_device *vdev);
> > > >
> > > >  #define virtio_device_for_each_vq(vdev, vq) \
> > > > list_for_each_entry(vq, &vdev->vqs, list)
> > > > --
> > > > 2.20.1
> > > >
> >


Re: [PATCH] virtio_ring: use u32 for virtio_max_dma_size

2023-05-09 Thread Jason Wang
On Wed, May 10, 2023 at 11:44 AM Michael S. Tsirkin  wrote:
>
> On Wed, May 10, 2023 at 11:26:54AM +0800, Xuan Zhuo wrote:
> > On Wed, 10 May 2023 10:54:37 +0800, zhenwei pi  
> > wrote:
> > > Both split ring and packed ring use 32bits to describe the length of
> > > a descriptor: see struct vring_desc and struct vring_packed_desc.
> > > This means the max segment size supported by virtio is U32_MAX.
> > >
> > > An example of virtio_max_dma_size in virtio_blk.c:
> > >   u32 v, max_size;
> > >
> > >   max_size = virtio_max_dma_size(vdev);  -> implicit convert
> > >   err = virtio_cread_feature(vdev, VIRTIO_BLK_F_SIZE_MAX,
> > >  struct virtio_blk_config, size_max, &v);
> > >   max_size = min(max_size, v);
> > >
> > > There is a risk during implicit convert here, once virtio_max_dma_size
> > > returns 4G, max_size becomes 0.
> > >
> > > Fixes: e6d6dd6c875e ("virtio: Introduce virtio_max_dma_size()")
> > > Cc: Joerg Roedel 
> > > Signed-off-by: zhenwei pi 
> > > ---
> > >  drivers/virtio/virtio_ring.c | 12 
> > >  include/linux/virtio.h   |  2 +-
> > >  2 files changed, 9 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > index c5310eaf8b46..55cfecf030a1 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -289,12 +289,16 @@ static bool vring_use_dma_api(const struct 
> > > virtio_device *vdev)
> > > return false;
> > >  }
> > >
> > > -size_t virtio_max_dma_size(const struct virtio_device *vdev)
> > > +u32 virtio_max_dma_size(const struct virtio_device *vdev)
> >
> >
> > LGTM
> >
> > But, should we change the parameter to vq, then use the dma_dev?
> >
> > @Jason
> >
> > Thanks.
> >
>
>
> that would be an unrelated rework.

Probably, but I think it's better to be done on top otherwise we may forget.

Thanks

>
> > >  {
> > > -   size_t max_segment_size = SIZE_MAX;
> > > +   u32 max_segment_size = U32_MAX;
> > >
> > > -   if (vring_use_dma_api(vdev))
> > > -   max_segment_size = dma_max_mapping_size(vdev->dev.parent);
> > > +   if (vring_use_dma_api(vdev)) {
> > > +   size_t max_dma_size = dma_max_mapping_size(vdev->dev.parent);
> > > +
> > > +   if (max_dma_size < max_segment_size)
> > > +   max_segment_size = max_dma_size;
> > > +   }
> > >
> > > return max_segment_size;
> > >  }
> > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > index b93238db94e3..1a605f408329 100644
> > > --- a/include/linux/virtio.h
> > > +++ b/include/linux/virtio.h
> > > @@ -147,7 +147,7 @@ int virtio_device_restore(struct virtio_device *dev);
> > >  #endif
> > >  void virtio_reset_device(struct virtio_device *dev);
> > >
> > > -size_t virtio_max_dma_size(const struct virtio_device *vdev);
> > > +u32 virtio_max_dma_size(const struct virtio_device *vdev);
> > >
> > >  #define virtio_device_for_each_vq(vdev, vq) \
> > > list_for_each_entry(vq, &vdev->vqs, list)
> > > --
> > > 2.20.1
> > >
>


Re: [RFC PATCH v2 1/3] PCI: endpoint: introduce a helper to implement pci ep virtio function

2023-05-09 Thread Shunsuke Mie
I'll fix the typos and style issues you mentioned.
In this e-mail, I reply to the other comments.
On Fri, Apr 28, 2023 at 3:28 Bjorn Helgaas  wrote:
>
> Simple typos, don't repost until there's more substantive feedback.
>
> On Thu, Apr 27, 2023 at 07:44:26PM +0900, Shunsuke Mie wrote:
> > The Linux PCIe Endpoint framework supports to implement PCIe endpoint
> > functions using a PCIe controller operating in endpoint mode.
> > It is possble to realize the behavior of PCIe device, such as virtio PCI
> > device. This patch introduces a setof helper functions and data structures
> > to implement a PCIe endpoint function that behaves as a virtio device.
>
> s/possble/possible/
> s/setof/set of/
>
> > Those functions enable the implementation PCIe endpoint function that
> > comply with the virtio lecacy specification. Because modern virtio
> > specifications require devices to implement custom PCIe capabilities, which
> > are not currently supported by either PCIe controllers/drivers or the PCIe
> > endpoint framework.
>
> s/implementation PCIe endpoint function/implementation of PCIe endpoint 
> functions/
> s/lecacy/legacy/ (What does "legacy" mean?  Is there a spec for this?)
Yes, it is defined in the spec.
> I guess "legacy virtio" devices need not implement custom PCIe
> capabilities, but "modern virtio" devices must implement them?
That is what I wanted to explain here.
> Capitalize "Endpoint framework" consistently; sometimes it's
> "Endpoint", other times it's "endpoint".
I'll take care to be consistent.
> > While this patch provides functions for negotiating with host drivers and
> > copying data, each PCIe function driver must impl ement operations that
> > depend on each specific device, such as network, block, etc.
>
> s/impl ement/implement/
>
> > +#include 
> > +#include 
> > +#include 
>
> Typically the header includes would be alphabetized if possible.
I'll try to reorder them.
> > + vq_virt = pci_epc_map_addr(epf->epc, epf->func_no, epf->vfunc_no,
> > +vq_pci_addr, vq_phys, vq_size);
> > + if (IS_ERR(vq_virt)) {
> > + pr_err("Failed to map virtuqueue to local");
>
> s/virtuqueue/virtqueue/
>
> I know you probably don't have any way to use dev_err(), but this
> message really needs some context, like a driver name and instance or
> something.
I'll try to use the dev_* functions appropriately if possible.
> > +#define VIRTIO_PCI_LEGACY_CFG_BAR 0
>
> What makes this a "legacy cfg BAR"?  Is there some spec that covers
> virtio BAR usage?
Yes. The virtio legacy interface defines the PCI BAR number to use.
https://docs.oasis-open.org/virtio/virtio/v1.2/csd01/virtio-v1.2-csd01.html#x1-14500010

> > + * epf_virtio_init - initialize struct epf_virtio and setup BAR for virtio
> > + * @evio: struct epf_virtio to initialize.
> > + * @hdr: pci configuration space to show remote host.
> > + * @bar_size: pci BAR size it depends on the virtio device type.
>
> s/pci/PCI/ (there were also a few more instances above in messages or
> comments)
>
> > + * epf_virtio_final - finalize struct epf_virtio. it frees bar and memories
> > + * @evio: struct epf_virtio to finalize.
>
> s/bar/BAR/
>
> > +static void epf_virtio_monitor_qnotify(struct epf_virtio *evio)
> > +{
> > + const u16 qn_default = evio->nvq;
> > + u16 tmp;
> > +
> > + /* Since there is no way to synchronize between the host and EP 
> > functions,
> > +  * it is possible to miss multiple notifications.
>
> Multi-line comment style.
>
> > + err = epf_virtio_negotiate_vq(evio);
> > + if (err < 0) {
> > + pr_err("failed to negoticate configs with driver\n");
>
> s/negoticate/negotiate/
>
> > + * epf_virtio_reset - reset virtio status
>
> Some of the function descriptions end with a period (".") and others
> don't.  Please figure out what the most common style is and use that
> consistently.
I agree. I'll fix to be consistent.
> > + dst = pci_epc_map_addr(epf->epc, epf->func_no,
> > +epf->vfunc_no, dbase, &phys,
> > +slen);
> > + if (IS_ERR(dst)) {
> > + pr_err("failed to map pci mmoery spact to 
> > local\n");
>
> s/pci/PCI/
> s/mmoery/memory/
> s/spact/space/ ?
>
> Also below.
>
> IIRC some previous messages started with a capital letter.  Please
> make them all consistent.
Sure.
> > + if (dir == DMA_MEM_TO_DEV) {
> > + pci_epc_unmap_addr(epf->epc, epf->func_no,
> > +epf->vfunc_no, phys, dst, slen);
> > + } else {
> > + pci_epc_unmap_addr(epf->epc, epf->func_no,
> > +epf->vfunc_no, phys, src, slen);
> > + }
> > + }
> > +
> > + return 1;
>
> I guess this function returns either a negative error code or the
> value 1?  That seems sort of weird (I think "negative error code or
> *zero* is more t

Re: Re: [PATCH] virtio_ring: use u32 for virtio_max_dma_size

2023-05-09 Thread zhenwei pi via Virtualization




On 5/10/23 11:39, Michael S. Tsirkin wrote:

On Wed, May 10, 2023 at 10:54:37AM +0800, zhenwei pi wrote:

Both split ring and packed ring use 32bits to describe the length of
a descriptor: see struct vring_desc and struct vring_packed_desc.
This means the max segment size supported by virtio is U32_MAX.

An example of virtio_max_dma_size in virtio_blk.c:
   u32 v, max_size;

   max_size = virtio_max_dma_size(vdev);  -> implicit convert
   err = virtio_cread_feature(vdev, VIRTIO_BLK_F_SIZE_MAX,
  struct virtio_blk_config, size_max, &v);
   max_size = min(max_size, v);

There is a risk during implicit convert here, once virtio_max_dma_size
returns 4G, max_size becomes 0.

Fixes: e6d6dd6c875e ("virtio: Introduce virtio_max_dma_size()")
Cc: Joerg Roedel 
Signed-off-by: zhenwei pi 



is this a theoretical concern or do you manage to trigger this
somehow?



I never hit any issue with this; I noticed it while diving into the
symbols exported by virtio_ring.ko.



---
  drivers/virtio/virtio_ring.c | 12 
  include/linux/virtio.h   |  2 +-
  2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index c5310eaf8b46..55cfecf030a1 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -289,12 +289,16 @@ static bool vring_use_dma_api(const struct virtio_device *vdev)
return false;
  }
  
-size_t virtio_max_dma_size(const struct virtio_device *vdev)

+u32 virtio_max_dma_size(const struct virtio_device *vdev)
  {
-   size_t max_segment_size = SIZE_MAX;
+   u32 max_segment_size = U32_MAX;
  
-	if (vring_use_dma_api(vdev))

-   max_segment_size = dma_max_mapping_size(vdev->dev.parent);
+   if (vring_use_dma_api(vdev)) {
+   size_t max_dma_size = dma_max_mapping_size(vdev->dev.parent);
+
+   if (max_dma_size < max_segment_size)
+   max_segment_size = max_dma_size;
+   }
  
  	return max_segment_size;

  }
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index b93238db94e3..1a605f408329 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -147,7 +147,7 @@ int virtio_device_restore(struct virtio_device *dev);
  #endif
  void virtio_reset_device(struct virtio_device *dev);
  
-size_t virtio_max_dma_size(const struct virtio_device *vdev);

+u32 virtio_max_dma_size(const struct virtio_device *vdev);
  
  #define virtio_device_for_each_vq(vdev, vq) \

list_for_each_entry(vq, &vdev->vqs, list)
--
2.20.1




--
zhenwei pi


Re: [PATCH] virtio_ring: use u32 for virtio_max_dma_size

2023-05-09 Thread Michael S. Tsirkin
On Wed, May 10, 2023 at 10:54:37AM +0800, zhenwei pi wrote:
> Both split ring and packed ring use 32bits to describe the length of
> a descriptor: see struct vring_desc and struct vring_packed_desc.
> This means the max segment size supported by virtio is U32_MAX.
> 
> An example of virtio_max_dma_size in virtio_blk.c:
>   u32 v, max_size;
> 
>   max_size = virtio_max_dma_size(vdev);  -> implicit convert
>   err = virtio_cread_feature(vdev, VIRTIO_BLK_F_SIZE_MAX,
>  struct virtio_blk_config, size_max, &v);
>   max_size = min(max_size, v);
> 
> There is a risk during implicit convert here, once virtio_max_dma_size
> returns 4G, max_size becomes 0.
> 
> Fixes: e6d6dd6c875e ("virtio: Introduce virtio_max_dma_size()")
> Cc: Joerg Roedel 
> Signed-off-by: zhenwei pi 


is this a theoretical concern or do you manage to trigger this
somehow?

> ---
>  drivers/virtio/virtio_ring.c | 12 
>  include/linux/virtio.h   |  2 +-
>  2 files changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index c5310eaf8b46..55cfecf030a1 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -289,12 +289,16 @@ static bool vring_use_dma_api(const struct 
> virtio_device *vdev)
>   return false;
>  }
>  
> -size_t virtio_max_dma_size(const struct virtio_device *vdev)
> +u32 virtio_max_dma_size(const struct virtio_device *vdev)
>  {
> - size_t max_segment_size = SIZE_MAX;
> + u32 max_segment_size = U32_MAX;
>  
> - if (vring_use_dma_api(vdev))
> - max_segment_size = dma_max_mapping_size(vdev->dev.parent);
> + if (vring_use_dma_api(vdev)) {
> + size_t max_dma_size = dma_max_mapping_size(vdev->dev.parent);
> +
> + if (max_dma_size < max_segment_size)
> + max_segment_size = max_dma_size;
> + }
>  
>   return max_segment_size;
>  }
> diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> index b93238db94e3..1a605f408329 100644
> --- a/include/linux/virtio.h
> +++ b/include/linux/virtio.h
> @@ -147,7 +147,7 @@ int virtio_device_restore(struct virtio_device *dev);
>  #endif
>  void virtio_reset_device(struct virtio_device *dev);
>  
> -size_t virtio_max_dma_size(const struct virtio_device *vdev);
> +u32 virtio_max_dma_size(const struct virtio_device *vdev);
>  
>  #define virtio_device_for_each_vq(vdev, vq) \
>   list_for_each_entry(vq, &vdev->vqs, list)
> -- 
> 2.20.1



Re: [PATCH] virtio_ring: use u32 for virtio_max_dma_size

2023-05-09 Thread Michael S. Tsirkin
On Wed, May 10, 2023 at 11:26:54AM +0800, Xuan Zhuo wrote:
> On Wed, 10 May 2023 10:54:37 +0800, zhenwei pi  
> wrote:
> > Both split ring and packed ring use 32bits to describe the length of
> > a descriptor: see struct vring_desc and struct vring_packed_desc.
> > This means the max segment size supported by virtio is U32_MAX.
> >
> > An example of virtio_max_dma_size in virtio_blk.c:
> >   u32 v, max_size;
> >
> >   max_size = virtio_max_dma_size(vdev);  -> implicit convert
> >   err = virtio_cread_feature(vdev, VIRTIO_BLK_F_SIZE_MAX,
> >  struct virtio_blk_config, size_max, &v);
> >   max_size = min(max_size, v);
> >
> > There is a risk during implicit convert here, once virtio_max_dma_size
> > returns 4G, max_size becomes 0.
> >
> > Fixes: e6d6dd6c875e ("virtio: Introduce virtio_max_dma_size()")
> > Cc: Joerg Roedel 
> > Signed-off-by: zhenwei pi 
> > ---
> >  drivers/virtio/virtio_ring.c | 12 
> >  include/linux/virtio.h   |  2 +-
> >  2 files changed, 9 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index c5310eaf8b46..55cfecf030a1 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -289,12 +289,16 @@ static bool vring_use_dma_api(const struct 
> > virtio_device *vdev)
> > return false;
> >  }
> >
> > -size_t virtio_max_dma_size(const struct virtio_device *vdev)
> > +u32 virtio_max_dma_size(const struct virtio_device *vdev)
> 
> 
> LGTM
> 
> But, should we change the parameter to vq, then use the dma_dev?
> 
> @Jason
> 
> Thanks.
> 


that would be an unrelated rework.

> >  {
> > -   size_t max_segment_size = SIZE_MAX;
> > +   u32 max_segment_size = U32_MAX;
> >
> > -   if (vring_use_dma_api(vdev))
> > -   max_segment_size = dma_max_mapping_size(vdev->dev.parent);
> > +   if (vring_use_dma_api(vdev)) {
> > +   size_t max_dma_size = dma_max_mapping_size(vdev->dev.parent);
> > +
> > +   if (max_dma_size < max_segment_size)
> > +   max_segment_size = max_dma_size;
> > +   }
> >
> > return max_segment_size;
> >  }
> > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > index b93238db94e3..1a605f408329 100644
> > --- a/include/linux/virtio.h
> > +++ b/include/linux/virtio.h
> > @@ -147,7 +147,7 @@ int virtio_device_restore(struct virtio_device *dev);
> >  #endif
> >  void virtio_reset_device(struct virtio_device *dev);
> >
> > -size_t virtio_max_dma_size(const struct virtio_device *vdev);
> > +u32 virtio_max_dma_size(const struct virtio_device *vdev);
> >
> >  #define virtio_device_for_each_vq(vdev, vq) \
> > list_for_each_entry(vq, &vdev->vqs, list)
> > --
> > 2.20.1
> >



Re: [PATCH] virtio_ring: use u32 for virtio_max_dma_size

2023-05-09 Thread Xuan Zhuo
On Wed, 10 May 2023 10:54:37 +0800, zhenwei pi  wrote:
> Both split ring and packed ring use 32bits to describe the length of
> a descriptor: see struct vring_desc and struct vring_packed_desc.
> This means the max segment size supported by virtio is U32_MAX.
>
> An example of virtio_max_dma_size in virtio_blk.c:
>   u32 v, max_size;
>
>   max_size = virtio_max_dma_size(vdev);  -> implicit convert
>   err = virtio_cread_feature(vdev, VIRTIO_BLK_F_SIZE_MAX,
>  struct virtio_blk_config, size_max, &v);
>   max_size = min(max_size, v);
>
> There is a risk during implicit convert here, once virtio_max_dma_size
> returns 4G, max_size becomes 0.
>
> Fixes: e6d6dd6c875e ("virtio: Introduce virtio_max_dma_size()")
> Cc: Joerg Roedel 
> Signed-off-by: zhenwei pi 
> ---
>  drivers/virtio/virtio_ring.c | 12 
>  include/linux/virtio.h   |  2 +-
>  2 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index c5310eaf8b46..55cfecf030a1 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -289,12 +289,16 @@ static bool vring_use_dma_api(const struct 
> virtio_device *vdev)
>   return false;
>  }
>
> -size_t virtio_max_dma_size(const struct virtio_device *vdev)
> +u32 virtio_max_dma_size(const struct virtio_device *vdev)


LGTM

But, should we change the parameter to vq, then use the dma_dev?

@Jason

Thanks.


>  {
> - size_t max_segment_size = SIZE_MAX;
> + u32 max_segment_size = U32_MAX;
>
> - if (vring_use_dma_api(vdev))
> - max_segment_size = dma_max_mapping_size(vdev->dev.parent);
> + if (vring_use_dma_api(vdev)) {
> + size_t max_dma_size = dma_max_mapping_size(vdev->dev.parent);
> +
> + if (max_dma_size < max_segment_size)
> + max_segment_size = max_dma_size;
> + }
>
>   return max_segment_size;
>  }
> diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> index b93238db94e3..1a605f408329 100644
> --- a/include/linux/virtio.h
> +++ b/include/linux/virtio.h
> @@ -147,7 +147,7 @@ int virtio_device_restore(struct virtio_device *dev);
>  #endif
>  void virtio_reset_device(struct virtio_device *dev);
>
> -size_t virtio_max_dma_size(const struct virtio_device *vdev);
> +u32 virtio_max_dma_size(const struct virtio_device *vdev);
>
>  #define virtio_device_for_each_vq(vdev, vq) \
>   list_for_each_entry(vq, &vdev->vqs, list)
> --
> 2.20.1
>


Re: [RFC PATCH v2 2/3] virtio_pci: add a definition of queue flag in ISR

2023-05-09 Thread Shunsuke Mie
On Mon, May 8, 2023 at 12:59 Jason Wang  wrote:
>
> On Thu, Apr 27, 2023 at 6:44 PM Shunsuke Mie  wrote:
> >
> > Already it has beed defined a config changed flag of ISR, but not the queue
>
> Typo should be "been".
Thanks for pointing that out.
> > flag. Add a macro for it.
> >
> > Signed-off-by: Shunsuke Mie 
>
> Other than this,
>
> Acked-by: Jason Wang 
>
> > ---
> >  include/uapi/linux/virtio_pci.h | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/include/uapi/linux/virtio_pci.h 
> > b/include/uapi/linux/virtio_pci.h
> > index f703afc7ad31..d405bea27240 100644
> > --- a/include/uapi/linux/virtio_pci.h
> > +++ b/include/uapi/linux/virtio_pci.h
> > @@ -94,6 +94,9 @@
> >
> >  #endif /* VIRTIO_PCI_NO_LEGACY */
> >
> > +/* Bits for ISR status field:only when if MSI-X is disabled */
> > +/* The bit of the ISR which indicates a queue entry update. */
> > +#define VIRTIO_PCI_ISR_QUEUE   0x1
> >  /* The bit of the ISR which indicates a device configuration change. */
> >  #define VIRTIO_PCI_ISR_CONFIG  0x2
> >  /* Vector value used to disable MSI for queue */
> > --
> > 2.25.1
> >
>

Re: [RFC PATCH v2 3/3] PCI: endpoint: Add EP function driver to provide virtio-console functionality

2023-05-09 Thread Shunsuke Mie
Hi Jason,
On Mon, May 8, 2023 at 13:03 Jason Wang  wrote:
>
> On Thu, Apr 27, 2023 at 6:44 PM Shunsuke Mie  wrote:
> >
> > Add a new PCIe endpoint function driver that works as a PCI virtio-console
> > device. The console connects to the endpoint-side console, enabling
> > communication between the PCIe host and the endpoint.
> >
> > Architecture is following:
> >
> >  ┌─────────────┐         ┌───────────────┬─────────────┐
> >  │ virtio      │         │               │ virtio      │
> >  │ console drv │         ├───────────────┤ console drv │
> >  ├─────────────┤         │(virtio console├─────────────┤
> >  │ virtio bus  │         │ device)   ◄──►│ virtio bus  │
> >  ├─────────────┤         ├───────────────┼─────────────┘
> >  │             │         │ pci ep virtio │
> >  │  pci bus    │         │  console drv  │
> >  │             │   pcie  ├───────────────┤
> >  │             │  ◄────► │  pci ep Bus   │
> >  └─────────────┘         └───────────────┘
> >    PCIe Root               PCIe Endpoint
> >
>
> I think it might only work for peer devices like:
>
> net, console or vsock.
Could you tell me what "peer devices" means?

> So there're many choices here, I'd like to know what's the reason for
> you to implement a mediation.
>
> An alternative is to implement a dedicated net, console and vsock
> driver for vringh (CAIF somehow works like this). This would have
> better performance.
Does it mean that the driver also functions as a network driver directly?
>
>
> > This driver has two roles. The first is as a PCIe endpoint virtio console
> > function, which is implemented using the PCIe endpoint framework and PCIe
> > EP virtio helpers. The second is as a virtual virtio console device
> > connected to the virtio bus on PCIe endpoint Linux.
> >
> > Communication between the two is achieved by copying the virtqueue data
> > between the PCIe root and the endpoint.
> >
> > This is a simple implementation and does not include features of
> > virtio-console such as MULTIPORT, EMERG_WRITE, etc. As a result, each
> > virtio console driver only displays /dev/hvc0.
> >
> > As an example of usage, by setting getty to /dev/hvc0, it is possible to
> > login to another host.
> >
> > Signed-off-by: Shunsuke Mie 
> > ---
> > Changes from v2:
> > - Change to use copy functions between kiovs of pci-epf-virtio.
> >
> >  drivers/pci/endpoint/functions/Kconfig|  12 +
> >  drivers/pci/endpoint/functions/Makefile   |   1 +
> >  drivers/pci/endpoint/functions/pci-epf-vcon.c | 596 ++
> >  3 files changed, 609 insertions(+)
> >  create mode 100644 drivers/pci/endpoint/functions/pci-epf-vcon.c
> >
> > diff --git a/drivers/pci/endpoint/functions/Kconfig 
> > b/drivers/pci/endpoint/functions/Kconfig
> > index fa1a6a569a8f..9ce2698b67e1 100644
> > --- a/drivers/pci/endpoint/functions/Kconfig
> > +++ b/drivers/pci/endpoint/functions/Kconfig
> > @@ -44,3 +44,15 @@ config PCI_EPF_VIRTIO
> > select VHOST_RING_IOMEM
> > help
> >   Helpers to implement PCI virtio Endpoint function
> > +
> > +config PCI_EPF_VCON
> > +   tristate "PCI Endpoint virito-console driver"
> > +   depends on PCI_ENDPOINT
> > +   select VHOST_RING
> > +   select PCI_EPF_VIRTIO
> > +   help
> > + PCIe Endpoint virtio-console function implementatino. This module
> > + enables to show the virtio-console as pci device to PCIe host 
> > side, and
> > + another virtual virtio-console device registers to endpoint 
> > system.
> > + Those devices are connected virtually and can communicate each 
> > other.
> > +
> > diff --git a/drivers/pci/endpoint/functions/Makefile 
> > b/drivers/pci/endpoint/functions/Makefile
> > index a96f127ce900..b4056689ce33 100644
> > --- a/drivers/pci/endpoint/functions/Makefile
> > +++ b/drivers/pci/endpoint/functions/Makefile
> > @@ -7,3 +7,4 @@ obj-$(CONFIG_PCI_EPF_TEST)  += pci-epf-test.o
> >  obj-$(CONFIG_PCI_EPF_NTB)  += pci-epf-ntb.o
> >  obj-$(CONFIG_PCI_EPF_VNTB) += pci-epf-vntb.o
> >  obj-$(CONFIG_PCI_EPF_VIRTIO)   += pci-epf-virtio.o
> > +obj-$(CONFIG_PCI_EPF_VCON) += pci-epf-vcon.o
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-vcon.c 
> > b/drivers/pci/endpoint/functions/pci-epf-vcon.c
> > new file mode 100644
> > index ..31f4247cd10f
> > --- /dev/null
> > +++ b/drivers/pci/endpoint/functions/pci-epf-vcon.c
> > @@ -0,0 +1,596 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * PCI Endpoint function driver to impliment virtio-console device
> > + * functionality.
> > + */
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "pci-epf-virtio.h"
> > +
> > +static int virtio_queue_size = 0x100;
> > +module_param(virtio_queue_size, int, 0444);
> > +MODULE_PARM_DESC(virtio_queue_size, "A length of virtqueue");
> > +
> > +

[PATCH] virtio_ring: use u32 for virtio_max_dma_size

2023-05-09 Thread zhenwei pi via Virtualization
Both split ring and packed ring use 32 bits to describe the length of
a descriptor: see struct vring_desc and struct vring_packed_desc.
This means the max segment size supported by virtio is U32_MAX.

An example of virtio_max_dma_size in virtio_blk.c:
  u32 v, max_size;

  max_size = virtio_max_dma_size(vdev);  -> implicit conversion
  err = virtio_cread_feature(vdev, VIRTIO_BLK_F_SIZE_MAX,
 struct virtio_blk_config, size_max, &v);
  max_size = min(max_size, v);

There is a risk in the implicit conversion here: if virtio_max_dma_size()
returns 4G, max_size becomes 0.

Fixes: e6d6dd6c875e ("virtio: Introduce virtio_max_dma_size()")
Cc: Joerg Roedel 
Signed-off-by: zhenwei pi 
---
 drivers/virtio/virtio_ring.c | 12 
 include/linux/virtio.h   |  2 +-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index c5310eaf8b46..55cfecf030a1 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -289,12 +289,16 @@ static bool vring_use_dma_api(const struct virtio_device 
*vdev)
return false;
 }
 
-size_t virtio_max_dma_size(const struct virtio_device *vdev)
+u32 virtio_max_dma_size(const struct virtio_device *vdev)
 {
-   size_t max_segment_size = SIZE_MAX;
+   u32 max_segment_size = U32_MAX;
 
-   if (vring_use_dma_api(vdev))
-   max_segment_size = dma_max_mapping_size(vdev->dev.parent);
+   if (vring_use_dma_api(vdev)) {
+   size_t max_dma_size = dma_max_mapping_size(vdev->dev.parent);
+
+   if (max_dma_size < max_segment_size)
+   max_segment_size = max_dma_size;
+   }
 
return max_segment_size;
 }
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index b93238db94e3..1a605f408329 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -147,7 +147,7 @@ int virtio_device_restore(struct virtio_device *dev);
 #endif
 void virtio_reset_device(struct virtio_device *dev);
 
-size_t virtio_max_dma_size(const struct virtio_device *vdev);
+u32 virtio_max_dma_size(const struct virtio_device *vdev);
 
 #define virtio_device_for_each_vq(vdev, vq) \
list_for_each_entry(vq, &vdev->vqs, list)
-- 
2.20.1



Re: [RFC PATCH v2 3/3] PCI: endpoint: Add EP function driver to provide virtio-console functionality

2023-05-09 Thread Shunsuke Mie
Hi Bjorn,
Thanks for the many comments. I will fix the style issues and typos as noted.

2023年4月28日(金) 3:09 Bjorn Helgaas :
>
> Random typos and trivial things.  No need to repost until somebody
> does a more substantive review.
>
> On Thu, Apr 27, 2023 at 07:44:28PM +0900, Shunsuke Mie wrote:
> > Add a new PCIe endpoint function driver that works as a pci virtio-console
> > device. The console connect to endpoint side console. It enables to
> > communicate PCIe host and endpoint.
>
> s/pci/PCI/
>
> > Architecture is following:
> >
> >  ┌─────────────┐       ┌─────────────────┐  ┌─────────────┐
> >  │ virtioe     │       │                 │  │ virtio      │
> >  │ console drv │       │ (virtio console │  │ console drv │
> >  ├─────────────┤       │  device)        │  ├─────────────┤
> >  │ virtio bus  │       │                 │◄►│ virtio bus  │
> >  ├─────────────┤       ├─────────────────┤  └─────────────┘
> >  │             │       │ pci ep virtio   │
> >  │   pci bus   │       │  console drv    │
> >  │             │  pcie ├─────────────────┤
> >  │             │ ◄───► │  pci ep Bus     │
> >  └─────────────┘       └─────────────────┘
> >     PCIe Root            PCIe Endpoint
>
> s/virtioe/virtio/
> s/pci/PCI/
> s/pcie/PCIe/
> s/ep/EP/
>
> > +config PCI_EPF_VCON
> > + tristate "PCI Endpoint virito-console driver"
>
> s/virito/virtio/
>
> > + depends on PCI_ENDPOINT
> > + select VHOST_RING
> > + select PCI_EPF_VIRTIO
> > + help
> > +   PCIe Endpoint virtio-console function implementatino. This module
> > +   enables to show the virtio-console as pci device to PCIe host side, 
> > and
> > +   another virtual virtio-console device registers to endpoint system.
> > +   Those devices are connected virtually and can communicate each 
> > other.
>
> s/implementatino/implementation/
> s/pci/PCI/
> s/can communicate/can communicate with/
>
> > + * PCI Endpoint function driver to impliment virtio-console device
> > + * functionality.
>
> s/impliment/implement/
>
> > +static int virtio_queue_size = 0x100;
> > +module_param(virtio_queue_size, int, 0444);
> > +MODULE_PARM_DESC(virtio_queue_size, "A length of virtqueue");
>
> When and why would users need this parameter?  Where is it documented?
>
> > + /* To access virtqueus of local host driver */
>
> s/virtqueus/virtqueues/
>
> > + /* To show a status whether this driver is ready and the remote is 
> > connected */
>
> Make this fit in 80 columns.
>
> > + /* This is a minimum implementation. Doesn't support any features of
> > +  * virtio console. It means driver and device use just 2 virtuque for 
> > tx
> > +  * and rx.
> > +  */
>
> Use common multi-line comment style:
>
>   /*
>* This is ...
>*/
I'll follow the style.
> s/virtuque/virtqueues/
>
> > +static void epf_vcon_raise_irq_handler(struct work_struct *work)
> > +{
> > + struct epf_vcon *vcon =
> > + container_of(work, struct epf_vcon, raise_irq_work);
>
> Rewrap.
>
> > +static int epf_vcon_setup_common(struct epf_vcon *vcon)
> > +{
> > + vcon->features = 0;
> > + vcon->connected = false;
> > +
> > + vcon->task_wq =
> > + alloc_workqueue("pci-epf-vcon/task-wq",
>
> Looks like this would fit on the previous line?
>
> > + WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0);
>
> > +static void epf_vcon_initialize_complete(void *param)
> > +{
> > + struct epf_vcon *vcon = param;
> > +
> > + pr_debug("Remote host has connected\n");
>
> Is there any device info you could include here, e.g., with dev_dbg()?
> It's nice if users have a little context.
I see. I'll use it.
> > +static int epf_vcon_setup_ep_func(struct epf_vcon *vcon, struct pci_epf 
> > *epf)
> > +{
> > + int err;
> > + struct epf_virtio *evio = &vcon->evio;
> > + unsigned int nvq = epf_vcon_get_nvq(vcon);
> > +
> > + vcon->rdev_iovs =
> > + kmalloc_array(nvq, sizeof(vcon->rdev_iovs[0]), GFP_KERNEL);
>
> Move the function name and as many parameters as fit in 80 columns to
> the previous line to match prevailing style.
I've just applied clang-format... Ok, I'll fix it manually.
> > + /* There is no config for virtio console because this console device
> > +  * doesn't any support features
> > +  */
>
> Multi-line comment style.
>
> s/doesn't any support/doesn't support any/?  Is that what you mean?
>
> > + /* Do nothing because this console device doesn't any support 
> > features */
>
> Same.
>
> > +static void epf_vcon_vdev_set_status(struct virtio_device *vdev, u8 status)
> > +{
> > + if (status & VIRTIO_CONFIG_S_FAILED)
> > + pr_debug("driver failed to setup this device\n");
>
> dev_dbg() if possible.
>
> > + err = vringh_init_kern(&vcon->vdev_vrhs[i], vcon->features,
> > +virtio_queue_size, false,

Re: [RFC PATCH 2/2] virtio-balloon: Add Working Set reporting

2023-05-09 Thread Michael S. Tsirkin
On Wed, May 10, 2023 at 02:54:19AM +0800, Yuanchu Xie wrote:
> diff --git a/include/uapi/linux/virtio_balloon.h 
> b/include/uapi/linux/virtio_balloon.h
> index ddaa45e723c4..06d0683d8d8c 100644
> --- a/include/uapi/linux/virtio_balloon.h
> +++ b/include/uapi/linux/virtio_balloon.h

Any changes to this have to be documented in the virtio spec
and be sent to virtio TC.

> @@ -37,6 +37,7 @@
>  #define VIRTIO_BALLOON_F_FREE_PAGE_HINT  3 /* VQ to report free pages */
>  #define VIRTIO_BALLOON_F_PAGE_POISON 4 /* Guest is using page poisoning */
>  #define VIRTIO_BALLOON_F_REPORTING   5 /* Page reporting virtqueue */
> +#define VIRTIO_BALLOON_F_WS_REPORTING 6 /* Working Set Size reporting */
>  
>  /* Size of a PFN in the balloon interface. */
>  #define VIRTIO_BALLOON_PFN_SHIFT 12
> @@ -59,6 +60,8 @@ struct virtio_balloon_config {
>   };
>   /* Stores PAGE_POISON if page poisoning is in use */
>   __le32 poison_val;
> + /* Number of bins for Working Set report if in use. */
> + __le32 ws_num_bins;

working_set_ pls. eschew abbreviation.
Really __le32? Is 4G bins reasonable? what if it's 0?

>  };
>  
>  #define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
> @@ -116,4 +119,22 @@ struct virtio_balloon_stat {
>   __virtio64 val;
>  } __attribute__((packed));
>  
> +enum virtio_balloon_ws_op {
> + VIRTIO_BALLOON_WS_REQUEST = 1,
> + VIRTIO_BALLOON_WS_CONFIG = 2,
> +};


what's this?

> +
> +struct virtio_balloon_ws {

document fields.

> +#define VIRTIO_BALLOON_WS_RECLAIMABLE 0
> +#define VIRTIO_BALLOON_WS_DISCARDABLE 1

what are these?

> + /* TODO: Provide additional detail on memory, e.g. reclaimable. */

Well? If we don't now, hypervisors will come to depend on
this being broken.

> + __virtio16 tag;
> + /* TODO: Support per-NUMA node reports. */

Same. This is ABI we can't merge with unaddressed TODO items.

> + __virtio16 node_id;
> + uint8_t reserved[4];
> + __virtio64 idle_age_ms;
> + /* Track separately for ANON_AND_FILE. */

What does this mean?

> + __virtio64 memory_size_bytes[2];




> +};
> +
>  #endif /* _LINUX_VIRTIO_BALLOON_H */

Use LE for new features please.

-- 
MST



[PATCH net v5] virtio_net: Fix error unwinding of XDP initialization

2023-05-09 Thread Feng Liu via Virtualization
When initializing XDP in virtnet_open(), some rq XDP initialization
may hit an error, causing the net device open to fail. However,
previous rqs have already initialized XDP and enabled NAPI, which is
not the expected behavior. We need to roll back the previous rq
initialization to avoid leaks in the error unwinding of the init code.

Also extract helper functions to disable and enable queue pairs.
Use the newly introduced disable helper in error unwinding and
virtnet_close, and the enable helper in virtnet_open.

Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info")
Signed-off-by: Feng Liu 
Reviewed-by: Jiri Pirko 
Reviewed-by: William Tu 
---
v4 -> v5
feedbacks from Michael S. Tsirkin
- rename helper as virtnet_disable_queue_pair
- rename helper as virtnet_enable_queue_pair

v3 -> v4
feedbacks from Jiri Pirko
- Add symmetric helper function virtnet_enable_qp to enable queues.
- Error handle:  cleanup current queue pair in virtnet_enable_qp,
  and complete the reset queue pairs cleanup in virtnet_open.
- Fix coding style.
feedbacks from Parav Pandit
- Remove redundant debug message and white space.

v2 -> v3
feedbacks from Michael S. Tsirkin
- Remove redundant comment.

v1 -> v2
feedbacks from Michael S. Tsirkin
- squash two patches together.

---
 drivers/net/virtio_net.c | 58 
 1 file changed, 41 insertions(+), 17 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 8d8038538fc4..664def938111 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1868,6 +1868,38 @@ static int virtnet_poll(struct napi_struct *napi, int 
budget)
return received;
 }
 
+static void virtnet_disable_queue_pair(struct virtnet_info *vi, int qp_index)
+{
+   virtnet_napi_tx_disable(&vi->sq[qp_index].napi);
+   napi_disable(&vi->rq[qp_index].napi);
+   xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq);
+}
+
+static int virtnet_enable_queue_pair(struct virtnet_info *vi, int qp_index)
+{
+   struct net_device *dev = vi->dev;
+   int err;
+
+   err = xdp_rxq_info_reg(&vi->rq[qp_index].xdp_rxq, dev, qp_index,
+  vi->rq[qp_index].napi.napi_id);
+   if (err < 0)
+   return err;
+
+   err = xdp_rxq_info_reg_mem_model(&vi->rq[qp_index].xdp_rxq,
+MEM_TYPE_PAGE_SHARED, NULL);
+   if (err < 0)
+   goto err_xdp_reg_mem_model;
+
+   virtnet_napi_enable(vi->rq[qp_index].vq, &vi->rq[qp_index].napi);
+   virtnet_napi_tx_enable(vi, vi->sq[qp_index].vq, &vi->sq[qp_index].napi);
+
+   return 0;
+
+err_xdp_reg_mem_model:
+   xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq);
+   return err;
+}
+
 static int virtnet_open(struct net_device *dev)
 {
struct virtnet_info *vi = netdev_priv(dev);
@@ -1881,22 +1913,17 @@ static int virtnet_open(struct net_device *dev)
if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
schedule_delayed_work(&vi->refill, 0);
 
-   err = xdp_rxq_info_reg(&vi->rq[i].xdp_rxq, dev, i, 
vi->rq[i].napi.napi_id);
+   err = virtnet_enable_queue_pair(vi, i);
if (err < 0)
-   return err;
-
-   err = xdp_rxq_info_reg_mem_model(&vi->rq[i].xdp_rxq,
-MEM_TYPE_PAGE_SHARED, NULL);
-   if (err < 0) {
-   xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
-   return err;
-   }
-
-   virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
-   virtnet_napi_tx_enable(vi, vi->sq[i].vq, &vi->sq[i].napi);
+   goto err_enable_qp;
}
 
return 0;
+
+err_enable_qp:
+   for (i--; i >= 0; i--)
+   virtnet_disable_queue_pair(vi, i);
+   return err;
 }
 
 static int virtnet_poll_tx(struct napi_struct *napi, int budget)
@@ -2305,11 +2332,8 @@ static int virtnet_close(struct net_device *dev)
/* Make sure refill_work doesn't re-enable napi! */
cancel_delayed_work_sync(&vi->refill);
 
-   for (i = 0; i < vi->max_queue_pairs; i++) {
-   virtnet_napi_tx_disable(&vi->sq[i].napi);
-   napi_disable(&vi->rq[i].napi);
-   xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
-   }
+   for (i = 0; i < vi->max_queue_pairs; i++)
+   virtnet_disable_queue_pair(vi, i);
 
return 0;
 }
-- 
2.37.1 (Apple Git-137.1)



Re: [PATCH net v4] virtio_net: Fix error unwinding of XDP initialization

2023-05-09 Thread Feng Liu via Virtualization




On 2023-05-09 12:42 a.m., Michael S. Tsirkin wrote:



On Mon, May 08, 2023 at 06:27:08PM -0400, Feng Liu wrote:

When initializing XDP in virtnet_open(), some rq xdp initialization
may hit an error causing net device open failed. However, previous
rqs have already initialized XDP and enabled NAPI, which is not the
expected behavior. Need to roll back the previous rq initialization
to avoid leaks in error unwinding of init code.

Also extract helper functions of disable and enable queue pairs.
Use newly introduced disable helper function in error unwinding and
virtnet_close. Use enable helper function in virtnet_open.

Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info")
Signed-off-by: Feng Liu 
Reviewed-by: Jiri Pirko 
---

v3 -> v4
feedbacks from Jiri Pirko
- Add symmetric helper function virtnet_enable_qp to enable queues.
- Error handle:  cleanup current queue pair in virtnet_enable_qp,
   and complete the reset queue pairs cleanup in virtnet_open.
- Fix coding style.
feedbacks from Parav Pandit
- Remove redundant debug message and white space.

v2 -> v3
feedbacks from Michael S. Tsirkin
- Remove redundant comment.

v1 -> v2
feedbacks from Michael S. Tsirkin
- squash two patches together.

---
  drivers/net/virtio_net.c | 58 
  1 file changed, 41 insertions(+), 17 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 8d8038538fc4..df7c08048fa7 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1868,6 +1868,38 @@ static int virtnet_poll(struct napi_struct *napi, int 
budget)
   return received;
  }

+static void virtnet_disable_qp(struct virtnet_info *vi, int qp_index)



I am guessing _qp stands for queue pair? Let's call it
virtnet_disable_queue_pair please, consistently with max_queue_pairs.


Yes, qp stands for queue pair
will do, thanks


+{
+ virtnet_napi_tx_disable(&vi->sq[qp_index].napi);
+ napi_disable(&vi->rq[qp_index].napi);
+ xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq);
+}
+
+static int virtnet_enable_qp(struct virtnet_info *vi, int qp_index)


Similarly, virtnet_enable_queue_pair


will do, thanks


+{
+ struct net_device *dev = vi->dev;
+ int err;
+
+ err = xdp_rxq_info_reg(&vi->rq[qp_index].xdp_rxq, dev, qp_index,
+vi->rq[qp_index].napi.napi_id);
+ if (err < 0)
+ return err;
+
+ err = xdp_rxq_info_reg_mem_model(&vi->rq[qp_index].xdp_rxq,
+  MEM_TYPE_PAGE_SHARED, NULL);
+ if (err < 0)
+ goto err_xdp_reg_mem_model;
+
+ virtnet_napi_enable(vi->rq[qp_index].vq, &vi->rq[qp_index].napi);
+ virtnet_napi_tx_enable(vi, vi->sq[qp_index].vq, &vi->sq[qp_index].napi);
+
+ return 0;
+
+err_xdp_reg_mem_model:
+ xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq);
+ return err;
+}
+
  static int virtnet_open(struct net_device *dev)
  {
   struct virtnet_info *vi = netdev_priv(dev);
@@ -1881,22 +1913,17 @@ static int virtnet_open(struct net_device *dev)
   if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
   schedule_delayed_work(&vi->refill, 0);

- err = xdp_rxq_info_reg(&vi->rq[i].xdp_rxq, dev, i, 
vi->rq[i].napi.napi_id);
+ err = virtnet_enable_qp(vi, i);
   if (err < 0)
- return err;
-
- err = xdp_rxq_info_reg_mem_model(&vi->rq[i].xdp_rxq,
-  MEM_TYPE_PAGE_SHARED, NULL);
- if (err < 0) {
- xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
- return err;
- }
-
- virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
- virtnet_napi_tx_enable(vi, vi->sq[i].vq, &vi->sq[i].napi);
+ goto err_enable_qp;
   }

   return 0;
+
+err_enable_qp:
+ for (i--; i >= 0; i--)
+ virtnet_disable_qp(vi, i);
+ return err;
  }

  static int virtnet_poll_tx(struct napi_struct *napi, int budget)
@@ -2305,11 +2332,8 @@ static int virtnet_close(struct net_device *dev)
   /* Make sure refill_work doesn't re-enable napi! */
   cancel_delayed_work_sync(&vi->refill);

- for (i = 0; i < vi->max_queue_pairs; i++) {
- virtnet_napi_tx_disable(&vi->sq[i].napi);
- napi_disable(&vi->rq[i].napi);
- xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
- }
+ for (i = 0; i < vi->max_queue_pairs; i++)
+ virtnet_disable_qp(vi, i);

   return 0;
  }
--
2.37.1 (Apple Git-137.1)





Re: [PATCH v2 2/3] vhost: support PACKED when setting-getting vring_base

2023-05-09 Thread Stefano Garzarella

On Mon, Apr 24, 2023 at 03:50:30PM -0700, Shannon Nelson via Virtualization 
wrote:

Use the right structs for PACKED or split vqs when setting and
getting the vring base.

Signed-off-by: Shannon Nelson 
---
drivers/vhost/vhost.c | 18 +-
drivers/vhost/vhost.h |  8 ++--
2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index f11bdbe4c2c5..f64efda48f21 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1633,17 +1633,25 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned 
int ioctl, void __user *arg
r = -EFAULT;
break;
}
-   if (s.num > 0x) {
-   r = -EINVAL;
-   break;
+   if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED)) {
+   vq->last_avail_idx = s.num & 0x;
+   vq->last_used_idx = (s.num >> 16) & 0x;
+   } else {
+   if (s.num > 0x) {
+   r = -EINVAL;
+   break;
+   }
+   vq->last_avail_idx = s.num;
}
-   vq->last_avail_idx = s.num;
/* Forget the cached index value. */
vq->avail_idx = vq->last_avail_idx;
break;
case VHOST_GET_VRING_BASE:
s.index = idx;
-   s.num = vq->last_avail_idx;
+   if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED))
+   s.num = (u32)vq->last_avail_idx | ((u32)vq->last_used_idx 
<< 16);
+   else
+   s.num = vq->last_avail_idx;


The changes LGTM, but since we are changing the UAPI, should we
update the documentation of VHOST_SET_VRING_BASE and
VHOST_GET_VRING_BASE in include/uapi/linux/vhost.h?

Thanks,
Stefano



[PATCH vhost v8 11/12] virtio_ring: separate the logic of reset/enable from virtqueue_resize

2023-05-09 Thread Xuan Zhuo
The subsequent reset function will reuse this logic.

Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
---
 drivers/virtio/virtio_ring.c | 58 
 1 file changed, 39 insertions(+), 19 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index e063ac764db6..58b1bf76341b 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2183,6 +2183,43 @@ static int virtqueue_resize_packed(struct virtqueue 
*_vq, u32 num)
return -ENOMEM;
 }
 
+static int virtqueue_disable_and_recycle(struct virtqueue *_vq,
+void (*recycle)(struct virtqueue *vq, 
void *buf))
+{
+   struct vring_virtqueue *vq = to_vvq(_vq);
+   struct virtio_device *vdev = vq->vq.vdev;
+   void *buf;
+   int err;
+
+   if (!vq->we_own_ring)
+   return -EPERM;
+
+   if (!vdev->config->disable_vq_and_reset)
+   return -ENOENT;
+
+   if (!vdev->config->enable_vq_after_reset)
+   return -ENOENT;
+
+   err = vdev->config->disable_vq_and_reset(_vq);
+   if (err)
+   return err;
+
+   while ((buf = virtqueue_detach_unused_buf(_vq)) != NULL)
+   recycle(_vq, buf);
+
+   return 0;
+}
+
+static int virtqueue_enable_after_reset(struct virtqueue *_vq)
+{
+   struct vring_virtqueue *vq = to_vvq(_vq);
+   struct virtio_device *vdev = vq->vq.vdev;
+
+   if (vdev->config->enable_vq_after_reset(_vq))
+   return -EBUSY;
+
+   return 0;
+}
 
 /*
  * Generic functions and exported symbols.
@@ -2751,13 +2788,8 @@ int virtqueue_resize(struct virtqueue *_vq, u32 num,
 void (*recycle)(struct virtqueue *vq, void *buf))
 {
struct vring_virtqueue *vq = to_vvq(_vq);
-   struct virtio_device *vdev = vq->vq.vdev;
-   void *buf;
int err;
 
-   if (!vq->we_own_ring)
-   return -EPERM;
-
if (num > vq->vq.num_max)
return -E2BIG;
 
@@ -2767,28 +2799,16 @@ int virtqueue_resize(struct virtqueue *_vq, u32 num,
if ((vq->packed_ring ? vq->packed.vring.num : vq->split.vring.num) == 
num)
return 0;
 
-   if (!vdev->config->disable_vq_and_reset)
-   return -ENOENT;
-
-   if (!vdev->config->enable_vq_after_reset)
-   return -ENOENT;
-
-   err = vdev->config->disable_vq_and_reset(_vq);
+   err = virtqueue_disable_and_recycle(_vq, recycle);
if (err)
return err;
 
-   while ((buf = virtqueue_detach_unused_buf(_vq)) != NULL)
-   recycle(_vq, buf);
-
if (vq->packed_ring)
err = virtqueue_resize_packed(_vq, num);
else
err = virtqueue_resize_split(_vq, num);
 
-   if (vdev->config->enable_vq_after_reset(_vq))
-   return -EBUSY;
-
-   return err;
+   return virtqueue_enable_after_reset(_vq);
 }
 EXPORT_SYMBOL_GPL(virtqueue_resize);
 
-- 
2.32.0.3.g01195cf9f



[PATCH vhost v8 12/12] virtio_ring: introduce virtqueue_reset()

2023-05-09 Thread Xuan Zhuo
Introduce virtqueue_reset() to release all buffers inside the vq.

Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
---
 drivers/virtio/virtio_ring.c | 33 +
 include/linux/virtio.h   |  2 ++
 2 files changed, 35 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 58b1bf76341b..589d1376785a 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2812,6 +2812,39 @@ int virtqueue_resize(struct virtqueue *_vq, u32 num,
 }
 EXPORT_SYMBOL_GPL(virtqueue_resize);
 
+/**
+ * virtqueue_reset - detach and recycle all unused buffers
+ * @_vq: the struct virtqueue we're talking about.
+ * @recycle: callback to recycle unused buffers
+ *
+ * Caller must ensure we don't call this with other virtqueue operations
+ * at the same time (except where noted).
+ *
+ * Returns zero or a negative error.
+ * 0: success.
+ * -EBUSY: Failed to sync with device, vq may not work properly
+ * -ENOENT: Transport or device not supported
+ * -EPERM: Operation not permitted
+ */
+int virtqueue_reset(struct virtqueue *_vq,
+   void (*recycle)(struct virtqueue *vq, void *buf))
+{
+   struct vring_virtqueue *vq = to_vvq(_vq);
+   int err;
+
+   err = virtqueue_disable_and_recycle(_vq, recycle);
+   if (err)
+   return err;
+
+   if (vq->packed_ring)
+   virtqueue_reinit_packed(vq);
+   else
+   virtqueue_reinit_split(vq);
+
+   return virtqueue_enable_after_reset(_vq);
+}
+EXPORT_SYMBOL_GPL(virtqueue_reset);
+
 /* Only available for split ring */
 struct virtqueue *vring_new_virtqueue(unsigned int index,
  unsigned int num,
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 5ebef5d50f04..bb449b857829 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -97,6 +97,8 @@ dma_addr_t virtqueue_get_used_addr(const struct virtqueue 
*vq);
 
 int virtqueue_resize(struct virtqueue *vq, u32 num,
 void (*recycle)(struct virtqueue *vq, void *buf));
+int virtqueue_reset(struct virtqueue *vq,
+   void (*recycle)(struct virtqueue *vq, void *buf));
 
 /**
  * struct virtio_device - representation of a device using virtio
-- 
2.32.0.3.g01195cf9f



[PATCH vhost v8 02/12] virtio_ring: packed: separate dma codes

2023-05-09 Thread Xuan Zhuo
DMA-related logic is separated from virtqueue_add_packed(). The DMA
address is saved as sg->dma_address, and virtqueue_add_packed() uses
it directly. The unmap operation becomes simpler.

The purpose of this is to facilitate subsequent support for receiving
DMA addresses mapped by drivers.

Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
---
 drivers/virtio/virtio_ring.c | 39 +---
 1 file changed, 9 insertions(+), 30 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index f243fea8987b..e5fa97c91f03 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1448,9 +1448,9 @@ static inline int virtqueue_add_packed(struct virtqueue 
*_vq,
struct vring_virtqueue *vq = to_vvq(_vq);
struct vring_packed_desc *desc;
struct scatterlist *sg;
-   unsigned int i, n, c, descs_used, err_idx;
+   unsigned int i, n, c, descs_used;
__le16 head_flags, flags;
-   u16 head, id, prev, curr, avail_used_flags;
+   u16 head, id, prev, curr;
int err;
 
START_USE(vq);
@@ -1479,7 +1479,6 @@ static inline int virtqueue_add_packed(struct virtqueue 
*_vq,
}
 
head = vq->packed.next_avail_idx;
-   avail_used_flags = vq->packed.avail_used_flags;
 
WARN_ON_ONCE(total_sg > vq->packed.vring.num && !vq->indirect);
 
@@ -1497,15 +1496,15 @@ static inline int virtqueue_add_packed(struct virtqueue 
*_vq,
id = vq->free_head;
BUG_ON(id == vq->packed.vring.num);
 
+   if (virtqueue_map_sgs(vq, sgs, total_sg, out_sgs, in_sgs)) {
+   END_USE(vq);
+   return -EIO;
+   }
+
curr = id;
c = 0;
for (n = 0; n < out_sgs + in_sgs; n++) {
for (sg = sgs[n]; sg; sg = sg_next(sg)) {
-   dma_addr_t addr = vring_map_one_sg(vq, sg, n < out_sgs ?
-   DMA_TO_DEVICE : DMA_FROM_DEVICE);
-   if (vring_mapping_error(vq, addr))
-   goto unmap_release;
-
flags = cpu_to_le16(vq->packed.avail_used_flags |
(++c == total_sg ? 0 : VRING_DESC_F_NEXT) |
(n < out_sgs ? 0 : VRING_DESC_F_WRITE));
@@ -1514,12 +1513,12 @@ static inline int virtqueue_add_packed(struct virtqueue 
*_vq,
else
desc[i].flags = flags;
 
-   desc[i].addr = cpu_to_le64(addr);
+   desc[i].addr = cpu_to_le64(vring_sg_address(sg));
desc[i].len = cpu_to_le32(sg->length);
desc[i].id = cpu_to_le16(id);
 
if (unlikely(vq->use_dma_api)) {
-   vq->packed.desc_extra[curr].addr = addr;
+   vq->packed.desc_extra[curr].addr = 
vring_sg_address(sg);
vq->packed.desc_extra[curr].len = sg->length;
vq->packed.desc_extra[curr].flags =
le16_to_cpu(flags);
@@ -1565,26 +1564,6 @@ static inline int virtqueue_add_packed(struct virtqueue 
*_vq,
END_USE(vq);
 
return 0;
-
-unmap_release:
-   err_idx = i;
-   i = head;
-   curr = vq->free_head;
-
-   vq->packed.avail_used_flags = avail_used_flags;
-
-   for (n = 0; n < total_sg; n++) {
-   if (i == err_idx)
-   break;
-   vring_unmap_extra_packed(vq, &vq->packed.desc_extra[curr]);
-   curr = vq->packed.desc_extra[curr].next;
-   i++;
-   if (i >= vq->packed.vring.num)
-   i = 0;
-   }
-
-   END_USE(vq);
-   return -EIO;
 }
 
 static bool virtqueue_kick_prepare_packed(struct virtqueue *_vq)
-- 
2.32.0.3.g01195cf9f



[PATCH vhost v8 09/12] virtio_ring: introduce virtqueue_dma_dev()

2023-05-09 Thread Xuan Zhuo
Add virtqueue_dma_dev() to get the DMA device for virtio, so the
caller can perform DMA operations in advance. The purpose is to keep
memory mapped across multiple add/get buf operations.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 17 +
 include/linux/virtio.h   |  2 ++
 2 files changed, 19 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index b7c9de0ae554..a0cad2977a61 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2328,6 +2328,23 @@ int virtqueue_add_inbuf_ctx(struct virtqueue *vq,
 }
 EXPORT_SYMBOL_GPL(virtqueue_add_inbuf_ctx);
 
+/**
+ * virtqueue_dma_dev - get the dma dev
+ * @_vq: the struct virtqueue we're talking about.
+ *
+ * Returns the dma dev. That can be used for the DMA API.
+ */
+struct device *virtqueue_dma_dev(struct virtqueue *_vq)
+{
+   struct vring_virtqueue *vq = to_vvq(_vq);
+
+   if (vq->use_dma_api)
+   return vring_dma_dev(vq);
+   else
+   return NULL;
+}
+EXPORT_SYMBOL_GPL(virtqueue_dma_dev);
+
 /**
  * virtqueue_kick_prepare - first half of split virtqueue_kick call.
  * @_vq: the struct virtqueue
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index b93238db94e3..5ebef5d50f04 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -61,6 +61,8 @@ int virtqueue_add_sgs(struct virtqueue *vq,
  void *data,
  gfp_t gfp);
 
+struct device *virtqueue_dma_dev(struct virtqueue *vq);
+
 bool virtqueue_kick(struct virtqueue *vq);
 
 bool virtqueue_kick_prepare(struct virtqueue *vq);
-- 
2.32.0.3.g01195cf9f



[PATCH vhost v8 10/12] virtio_ring: correct the expression of the description of virtqueue_resize()

2023-05-09 Thread Xuan Zhuo
Replace the word "useless" with the more accurate "unused".

Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
---
 drivers/virtio/virtio_ring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index a0cad2977a61..e063ac764db6 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2727,7 +2727,7 @@ EXPORT_SYMBOL_GPL(vring_create_virtqueue_dma);
  * virtqueue_resize - resize the vring of vq
  * @_vq: the struct virtqueue we're talking about.
  * @num: new ring num
- * @recycle: callback for recycle the useless buffer
+ * @recycle: callback to recycle unused buffers
  *
  * When it is really necessary to create a new vring, it will set the current vq
  * into the reset state. Then call the passed callback to recycle the buffer
-- 
2.32.0.3.g01195cf9f



[PATCH vhost v8 08/12] virtio_ring: update document for virtqueue_add_*

2023-05-09 Thread Xuan Zhuo
Update the documentation of the virtqueue_add_* APIs, allowing callers to
use sg->dma_address to pass a premapped DMA address to the virtio core.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index eb207da13735..b7c9de0ae554 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2217,6 +2217,11 @@ static inline int virtqueue_add(struct virtqueue *_vq,
  * Caller must ensure we don't call this with other virtqueue operations
  * at the same time (except where noted).
  *
+ * If the caller has done dma map then use sg->dma_address to pass dma address.
+ * If one sg->dma_address is used, then all sgs must use sg->dma_address;
+ * otherwise all sg->dma_address must be NULL. If sg->dma_address is used,
+ * sg_page(sg) must be null.
+ *
  * Returns zero or a negative error (ie. ENOSPC, ENOMEM, EIO).
  */
 int virtqueue_add_sgs(struct virtqueue *_vq,
@@ -2251,6 +2256,11 @@ EXPORT_SYMBOL_GPL(virtqueue_add_sgs);
  * Caller must ensure we don't call this with other virtqueue operations
  * at the same time (except where noted).
  *
+ * If the caller has done dma map then use sg->dma_address to pass dma address.
+ * If one sg->dma_address is used, then all sgs must use sg->dma_address;
+ * otherwise all sg->dma_address must be NULL. If sg->dma_address is used,
+ * sg_page(sg) must be null.
+ *
  * Returns zero or a negative error (ie. ENOSPC, ENOMEM, EIO).
  */
 int virtqueue_add_outbuf(struct virtqueue *vq,
@@ -2273,6 +2283,11 @@ EXPORT_SYMBOL_GPL(virtqueue_add_outbuf);
  * Caller must ensure we don't call this with other virtqueue operations
  * at the same time (except where noted).
  *
+ * If the caller has done dma map then use sg->dma_address to pass dma address.
+ * If one sg->dma_address is used, then all sgs must use sg->dma_address;
+ * otherwise all sg->dma_address must be NULL. If sg->dma_address is used,
+ * sg_page(sg) must be null.
+ *
  * Returns zero or a negative error (ie. ENOSPC, ENOMEM, EIO).
  */
 int virtqueue_add_inbuf(struct virtqueue *vq,
@@ -2296,6 +2311,11 @@ EXPORT_SYMBOL_GPL(virtqueue_add_inbuf);
  * Caller must ensure we don't call this with other virtqueue operations
  * at the same time (except where noted).
  *
+ * If the caller has done dma map then use sg->dma_address to pass dma address.
+ * If one sg->dma_address is used, then all sgs must use sg->dma_address;
+ * otherwise all sg->dma_address must be NULL. If sg->dma_address is used,
+ * sg_page(sg) must be null.
+ *
  * Returns zero or a negative error (ie. ENOSPC, ENOMEM, EIO).
  */
 int virtqueue_add_inbuf_ctx(struct virtqueue *vq,
-- 
2.32.0.3.g01195cf9f



[PATCH vhost v8 00/12] virtio core prepares for AF_XDP

2023-05-09 Thread Xuan Zhuo
## About DMA APIs

Currently, virtio cannot work with the DMA APIs when the device does not offer
VIRTIO_F_ACCESS_PLATFORM.

1. I tried to make the DMA APIs return the physical address for a virtio
   device, but the DMA APIs only work with "real" devices.
2. I tried to let xsk support callbacks to get the physical address from the
   virtio-net driver as the DMA address, but the xsk maintainers may want to
   replace the DMA APIs with dma-buf. I think that would be a larger effort,
   and we would wait too long.

So rethinking this, firstly, we can support premapped-dma only for devices with
VIRTIO_F_ACCESS_PLATFORM. In the case of af-xdp, if the users want to use it,
they have to update the device to support VIRTIO_F_RING_RESET, and they can also
enable the device's VIRTIO_F_ACCESS_PLATFORM feature.

Thanks for the help from Christoph.

=

XDP socket (AF_XDP) is an excellent kernel-bypass network framework. The zero
copy feature of xsk (XDP socket) needs to be supported by the driver, and the
zero copy performance is very good.

ENV: Qemu with vhost.

                  | vhost cpu | Guest APP CPU | Guest Softirq CPU | PPS
------------------|-----------|---------------|-------------------|--------
 xmit by sockperf |    90%    |     100%      |                   |  318967
 xmit by xsk      |   100%    |      30%      |        33%        | 1192064
 recv by sockperf |   100%    |      68%      |       100%        |  692288
 recv by xsk      |   100%    |      33%      |        43%        |  771670

Before implementing this in virtio-net, we first have to make the virtio core
support these features:

1. virtio core support premapped
2. virtio core support reset per-queue
3. introduce DMA APIs to virtio core

Please review.

Thanks.

v8:
 1. vring_sg_address: check by sg_page(sg), not dma_address, because 0 is a
    valid dma address
 2. remove unused code from vring_map_one_sg()

v7:
 1. virtqueue_dma_dev() return NULL when virtio is without DMA API.

v6:
 1. change the size of the flags to u32.

v5:
 1. fix for error handler
 2. add flags to record internal dma mapping

v4:
 1. rename map_inter to dma_map_internal
 2. fix: Excess function parameter 'vq' description in 'virtqueue_dma_dev'

v3:
 1. add map_inter to struct desc state to record whether the virtio core does
    the dma map

v2:
 1. use sgs[0]->dma_address to judge whether the buffers are premapped
 2. use extra.addr to judge whether to unmap non-indirect descs
 3. use indir_desc to judge whether to unmap indirect descs
 4. rename virtqueue_get_dma_dev to virtqueue_dma_dev

v1:
 1. expose the dma device; do not introduce APIs for dma map and sync
 2. split some commits for easier review.


Xuan Zhuo (12):
  virtio_ring: split: separate dma codes
  virtio_ring: packed: separate dma codes
  virtio_ring: packed-indirect: separate dma codes
  virtio_ring: split: support premapped
  virtio_ring: packed: support premapped
  virtio_ring: packed-indirect: support premapped
  virtio_ring: remove unused code
  virtio_ring: update document for virtqueue_add_*
  virtio_ring: introduce virtqueue_dma_dev()
  virtio_ring: correct the expression of the description of
virtqueue_resize()
  virtio_ring: separate the logic of reset/enable from virtqueue_resize
  virtio_ring: introduce virtqueue_reset()

 drivers/virtio/virtio_ring.c | 375 +--
 include/linux/virtio.h   |   4 +
 2 files changed, 272 insertions(+), 107 deletions(-)

--
2.32.0.3.g01195cf9f



[PATCH vhost v8 07/12] virtio_ring: remove unused code

2023-05-09 Thread Xuan Zhuo
Now, vring_map_one_sg() is only called by virtqueue_map_sgs(), and
!use_dma_api is checked before vring_map_one_sg() is called, so we do not
need to check !use_dma_api inside vring_map_one_sg().

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index c185354fcac6..eb207da13735 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -364,16 +364,6 @@ static dma_addr_t vring_map_one_sg(const struct vring_virtqueue *vq,
   struct scatterlist *sg,
   enum dma_data_direction direction)
 {
-   if (!vq->use_dma_api) {
-   /*
-* If DMA is not used, KMSAN doesn't know that the scatterlist
-* is initialized by the hardware. Explicitly check/unpoison it
-* depending on the direction.
-*/
-   kmsan_handle_dma(sg_page(sg), sg->offset, sg->length, direction);
-   return (dma_addr_t)sg_phys(sg);
-   }
-
/*
 * We can't use dma_map_sg, because we don't use scatterlists in
 * the way it expects (we don't guarantee that the scatterlist
-- 
2.32.0.3.g01195cf9f



[PATCH vhost v8 05/12] virtio_ring: packed: support premapped

2023-05-09 Thread Xuan Zhuo
The virtio core only supports virtual addresses; DMA mapping is completed
inside the virtio core.

In some scenarios (such as AF_XDP), the memory is allocated and the DMA
mapping is completed in advance, so it is necessary for us to support passing
the DMA address to the virtio core.

Drivers can use sg->dma_address to pass the mapped dma address to the virtio
core. If one sg->dma_address is used then all sgs must use sg->dma_address;
otherwise all dma_address fields must be null when passing sgs to the virtio
APIs.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 670b268a10ee..ccf6fd3c066a 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -81,6 +81,7 @@ struct vring_desc_state_packed {
struct vring_packed_desc *indir_desc; /* Indirect descriptor, if any. */
u16 num;/* Descriptor list length. */
u16 last;   /* The last desc state in a list. */
+   u32 flags;  /* State flags. */
 };
 
 struct vring_desc_extra {
@@ -1281,7 +1282,8 @@ static u16 packed_last_used(u16 last_used_idx)
 }
 
 static void vring_unmap_extra_packed(const struct vring_virtqueue *vq,
-const struct vring_desc_extra *extra)
+const struct vring_desc_extra *extra,
+bool dma_map_internal)
 {
u16 flags;
 
@@ -1296,6 +1298,9 @@ static void vring_unmap_extra_packed(const struct vring_virtqueue *vq,
 (flags & VRING_DESC_F_WRITE) ?
 DMA_FROM_DEVICE : DMA_TO_DEVICE);
} else {
+   if (!dma_map_internal)
+   return;
+
dma_unmap_page(vring_dma_dev(vq),
   extra->addr, extra->len,
   (flags & VRING_DESC_F_WRITE) ?
@@ -1462,6 +1467,7 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
unsigned int i, n, c, descs_used;
__le16 head_flags, flags;
u16 head, id, prev, curr;
+   bool dma_map_internal;
int err;
 
START_USE(vq);
@@ -1507,7 +1513,8 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
id = vq->free_head;
BUG_ON(id == vq->packed.vring.num);
 
-   if (virtqueue_map_sgs(vq, sgs, total_sg, out_sgs, in_sgs)) {
+   dma_map_internal = !!sg_page(sgs[0]);
+   if (dma_map_internal && virtqueue_map_sgs(vq, sgs, total_sg, out_sgs, in_sgs)) {
END_USE(vq);
return -EIO;
}
@@ -1561,6 +1568,7 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
vq->packed.desc_state[id].data = data;
vq->packed.desc_state[id].indir_desc = ctx;
vq->packed.desc_state[id].last = prev;
+   vq->packed.desc_state[id].flags = dma_map_internal ? VRING_STATE_F_MAP_INTERNAL : 0;
 
/*
 * A driver MUST NOT make the first descriptor in the list
@@ -1632,8 +1640,10 @@ static void detach_buf_packed(struct vring_virtqueue *vq,
struct vring_desc_state_packed *state = NULL;
struct vring_packed_desc *desc;
unsigned int i, curr;
+   bool dma_map_internal;
 
state = &vq->packed.desc_state[id];
+   dma_map_internal = !!(state->flags & VRING_STATE_F_MAP_INTERNAL);
 
/* Clear data ptr. */
state->data = NULL;
@@ -1646,7 +1656,8 @@ static void detach_buf_packed(struct vring_virtqueue *vq,
curr = id;
for (i = 0; i < state->num; i++) {
vring_unmap_extra_packed(vq,
-&vq->packed.desc_extra[curr]);
+&vq->packed.desc_extra[curr],
+dma_map_internal);
curr = vq->packed.desc_extra[curr].next;
}
}
-- 
2.32.0.3.g01195cf9f



[PATCH vhost v8 06/12] virtio_ring: packed-indirect: support premapped

2023-05-09 Thread Xuan Zhuo
The virtio core only supports virtual addresses; DMA mapping is completed
inside the virtio core.

In some scenarios (such as AF_XDP), the memory is allocated and the DMA
mapping is completed in advance, so it is necessary for us to support passing
the DMA address to the virtio core.

Drivers can use sg->dma_address to pass the mapped dma address to the virtio
core. If one sg->dma_address is used then all sgs must use sg->dma_address;
otherwise all dma_address fields must be null when passing sgs to the virtio
APIs.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index ccf6fd3c066a..c185354fcac6 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1355,6 +1355,7 @@ static int virtqueue_add_indirect_packed(struct vring_virtqueue *vq,
unsigned int i, n;
u16 head, id;
dma_addr_t addr;
+   bool dma_map_internal;
 
head = vq->packed.next_avail_idx;
desc = alloc_indirect_packed(total_sg, gfp);
@@ -1372,7 +1373,8 @@ static int virtqueue_add_indirect_packed(struct vring_virtqueue *vq,
id = vq->free_head;
BUG_ON(id == vq->packed.vring.num);
 
-   if (virtqueue_map_sgs(vq, sgs, total_sg, out_sgs, in_sgs))
+   dma_map_internal = !!sg_page(sgs[0]);
+   if (dma_map_internal && virtqueue_map_sgs(vq, sgs, total_sg, out_sgs, in_sgs))
goto err_map;
 
for (n = 0; n < out_sgs + in_sgs; n++) {
@@ -1434,6 +1436,8 @@ static int virtqueue_add_indirect_packed(struct vring_virtqueue *vq,
vq->packed.desc_state[id].data = data;
vq->packed.desc_state[id].indir_desc = desc;
vq->packed.desc_state[id].last = id;
+   vq->packed.desc_state[id].flags = dma_map_internal ? VRING_STATE_F_MAP_INTERNAL : 0;
+
 
vq->num_added += 1;
 
@@ -1443,7 +1447,8 @@ static int virtqueue_add_indirect_packed(struct vring_virtqueue *vq,
return 0;
 
 unmap_release:
-   virtqueue_unmap_sgs(vq, sgs, total_sg, out_sgs, in_sgs);
+   if (dma_map_internal)
+   virtqueue_unmap_sgs(vq, sgs, total_sg, out_sgs, in_sgs);
 
 err_map:
kfree(desc);
@@ -1670,7 +1675,7 @@ static void detach_buf_packed(struct vring_virtqueue *vq,
if (!desc)
return;
 
-   if (vq->use_dma_api) {
+   if (vq->use_dma_api && dma_map_internal) {
len = vq->packed.desc_extra[id].len;
for (i = 0; i < len / sizeof(struct vring_packed_desc);
i++)
-- 
2.32.0.3.g01195cf9f



[PATCH vhost v8 04/12] virtio_ring: split: support premapped

2023-05-09 Thread Xuan Zhuo
The virtio core only supports virtual addresses; DMA mapping is completed
inside the virtio core.

In some scenarios (such as AF_XDP), the memory is allocated and the DMA
mapping is completed in advance, so it is necessary for us to support passing
the DMA address to the virtio core.

Drivers can use sg->dma_address to pass the mapped dma address to the virtio
core. If one sg->dma_address is used then all sgs must use sg->dma_address;
otherwise all dma_address fields must be null when passing sgs to the virtio
APIs.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 31 +++
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index c4f0eb4e5a28..670b268a10ee 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -67,9 +67,13 @@
 #define LAST_ADD_TIME_INVALID(vq)
 #endif
 
+#define VRING_STATE_F_MAP_INTERNAL BIT(0)
+
 struct vring_desc_state_split {
void *data; /* Data for callback. */
struct vring_desc *indir_desc;  /* Indirect descriptor, if any. */
+   u32 flags;  /* State flags. */
+   u32 padding;
 };
 
 struct vring_desc_state_packed {
@@ -455,7 +459,7 @@ static void vring_unmap_one_split_indirect(const struct vring_virtqueue *vq,
 }
 
 static unsigned int vring_unmap_one_split(const struct vring_virtqueue *vq,
- unsigned int i)
+ unsigned int i, bool dma_map_internal)
 {
struct vring_desc_extra *extra = vq->split.desc_extra;
u16 flags;
@@ -472,6 +476,9 @@ static unsigned int vring_unmap_one_split(const struct vring_virtqueue *vq,
 (flags & VRING_DESC_F_WRITE) ?
 DMA_FROM_DEVICE : DMA_TO_DEVICE);
} else {
+   if (!dma_map_internal)
+   goto out;
+
dma_unmap_page(vring_dma_dev(vq),
   extra[i].addr,
   extra[i].len,
@@ -624,7 +631,7 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
struct scatterlist *sg;
struct vring_desc *desc;
unsigned int i, n, avail, descs_used, prev;
-   bool indirect;
+   bool indirect, dma_map_internal;
int head;
 
START_USE(vq);
@@ -677,7 +684,8 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
return -ENOSPC;
}
 
-   if (virtqueue_map_sgs(vq, sgs, total_sg, out_sgs, in_sgs))
+   dma_map_internal = !!sg_page(sgs[0]);
+   if (dma_map_internal && virtqueue_map_sgs(vq, sgs, total_sg, out_sgs, in_sgs))
goto err_map;
 
for (n = 0; n < out_sgs; n++) {
@@ -744,6 +752,8 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
else
vq->split.desc_state[head].indir_desc = ctx;
 
+   vq->split.desc_state[head].flags = dma_map_internal ? VRING_STATE_F_MAP_INTERNAL : 0;
+
/* Put entry in available array (but don't update avail->idx until they
 * do sync). */
avail = vq->split.avail_idx_shadow & (vq->split.vring.num - 1);
@@ -768,7 +778,8 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
return 0;
 
 unmap_release:
-   virtqueue_unmap_sgs(vq, sgs, total_sg, out_sgs, in_sgs);
+   if (dma_map_internal)
+   virtqueue_unmap_sgs(vq, sgs, total_sg, out_sgs, in_sgs);
 
 err_map:
if (indirect)
@@ -814,20 +825,22 @@ static void detach_buf_split(struct vring_virtqueue *vq, unsigned int head,
 {
unsigned int i, j;
__virtio16 nextflag = cpu_to_virtio16(vq->vq.vdev, VRING_DESC_F_NEXT);
+   bool dma_map_internal;
 
/* Clear data ptr. */
vq->split.desc_state[head].data = NULL;
+   dma_map_internal = !!(vq->split.desc_state[head].flags & VRING_STATE_F_MAP_INTERNAL);
 
/* Put back on free list: unmap first-level descriptors and find end */
i = head;
 
while (vq->split.vring.desc[i].flags & nextflag) {
-   vring_unmap_one_split(vq, i);
+   vring_unmap_one_split(vq, i, dma_map_internal);
i = vq->split.desc_extra[i].next;
vq->vq.num_free++;
}
 
-   vring_unmap_one_split(vq, i);
+   vring_unmap_one_split(vq, i, dma_map_internal);
vq->split.desc_extra[i].next = vq->free_head;
vq->free_head = head;
 
@@ -849,8 +862,10 @@ static void detach_buf_split(struct vring_virtqueue *vq, unsigned int head,
VRING_DESC_F_INDIRECT));
BUG_ON(len == 0 || len % sizeof(struct vring_desc));
 
-   for (j = 0; j < len / sizeof(struct vring_desc); j++)
-   vring_unmap_one_split_indirect(vq, &indir_desc[j]);
+   if (dma_map_internal) {
+   for (j = 0; j < len / sizeof(struct vring_desc); j++)
+   vring_unmap_one_split_indirect(vq, &indir_desc[j]);

[PATCH vhost v8 03/12] virtio_ring: packed-indirect: separate dma codes

2023-05-09 Thread Xuan Zhuo
DMA-related logic is separated from virtqueue_add_indirect_packed().

The DMA address will be saved as sg->dma_address, and
virtqueue_add_indirect_packed() will use it directly. The unmap operation
becomes simpler.

The purpose is to facilitate subsequent support for receiving DMA addresses
mapped by drivers.

Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
---
 drivers/virtio/virtio_ring.c | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index e5fa97c91f03..c4f0eb4e5a28 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1332,7 +1332,7 @@ static int virtqueue_add_indirect_packed(struct vring_virtqueue *vq,
 {
struct vring_packed_desc *desc;
struct scatterlist *sg;
-   unsigned int i, n, err_idx;
+   unsigned int i, n;
u16 head, id;
dma_addr_t addr;
 
@@ -1352,16 +1352,14 @@ static int virtqueue_add_indirect_packed(struct vring_virtqueue *vq,
id = vq->free_head;
BUG_ON(id == vq->packed.vring.num);
 
+   if (virtqueue_map_sgs(vq, sgs, total_sg, out_sgs, in_sgs))
+   goto err_map;
+
for (n = 0; n < out_sgs + in_sgs; n++) {
for (sg = sgs[n]; sg; sg = sg_next(sg)) {
-   addr = vring_map_one_sg(vq, sg, n < out_sgs ?
-   DMA_TO_DEVICE : DMA_FROM_DEVICE);
-   if (vring_mapping_error(vq, addr))
-   goto unmap_release;
-
desc[i].flags = cpu_to_le16(n < out_sgs ?
0 : VRING_DESC_F_WRITE);
-   desc[i].addr = cpu_to_le64(addr);
+   desc[i].addr = cpu_to_le64(vring_sg_address(sg));
desc[i].len = cpu_to_le32(sg->length);
i++;
}
@@ -1425,11 +1423,9 @@ static int virtqueue_add_indirect_packed(struct vring_virtqueue *vq,
return 0;
 
 unmap_release:
-   err_idx = i;
-
-   for (i = 0; i < err_idx; i++)
-   vring_unmap_desc_packed(vq, &desc[i]);
+   virtqueue_unmap_sgs(vq, sgs, total_sg, out_sgs, in_sgs);
 
+err_map:
kfree(desc);
 
END_USE(vq);
-- 
2.32.0.3.g01195cf9f



[PATCH vhost v8 01/12] virtio_ring: split: separate dma codes

2023-05-09 Thread Xuan Zhuo
DMA-related logic is separated from virtqueue_add_split() into one new
function. The DMA address will be saved as sg->dma_address if use_dma_api is
true, and virtqueue_add_split() will use it directly. The unmap operation
becomes simpler.

The purpose is to facilitate subsequent support for receiving DMA addresses
mapped by drivers.

NOTE: We set sg_page to null after DMA mapping is done, so only one of sg
page and dma_address is valid, and we can use sg_page to judge which one to
use. We cannot check dma_address, since dma_address is initialized to 0,
which is a valid DMA address.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 131 +++
 1 file changed, 103 insertions(+), 28 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index c5310eaf8b46..f243fea8987b 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -379,6 +379,21 @@ static dma_addr_t vring_map_one_sg(const struct vring_virtqueue *vq,
direction);
 }
 
+static dma_addr_t vring_sg_address(struct scatterlist *sg)
+{
+   /*
+* Only one of sg_page() and dma_address is valid. If dma mapping is
+* done, we set sg page to null.
+*
+* We can not check dma_address, dma_address is initialized to 0, but
+* that is a valid dma address.
+*/
+   if (sg_page(sg))
+   return (dma_addr_t)sg_phys(sg);
+
+   return sg->dma_address;
+}
+
 static dma_addr_t vring_map_single(const struct vring_virtqueue *vq,
   void *cpu_addr, size_t size,
   enum dma_data_direction direction)
@@ -520,6 +535,82 @@ static inline unsigned int virtqueue_add_desc_split(struct virtqueue *vq,
return next;
 }
 
+static void virtqueue_unmap_sgs(struct vring_virtqueue *vq,
+   struct scatterlist *sgs[],
+   unsigned int total_sg,
+   unsigned int out_sgs,
+   unsigned int in_sgs)
+{
+   struct scatterlist *sg;
+   unsigned int n;
+
+   if (!vq->use_dma_api)
+   return;
+
+   for (n = 0; n < out_sgs; n++) {
+   for (sg = sgs[n]; sg; sg = sg_next(sg)) {
+   if (sg_page(sg))
+   return;
+
+   dma_unmap_page(vring_dma_dev(vq), sg->dma_address,
+  sg->length, DMA_TO_DEVICE);
+   }
+   }
+
+   for (; n < (out_sgs + in_sgs); n++) {
+   for (sg = sgs[n]; sg; sg = sg_next(sg)) {
+   if (sg_page(sg))
+   return;
+
+   dma_unmap_page(vring_dma_dev(vq), sg->dma_address,
+  sg->length, DMA_FROM_DEVICE);
+   }
+   }
+}
+
+static int virtqueue_map_sgs(struct vring_virtqueue *vq,
+struct scatterlist *sgs[],
+unsigned int total_sg,
+unsigned int out_sgs,
+unsigned int in_sgs)
+{
+   struct scatterlist *sg;
+   unsigned int n;
+
+   if (!vq->use_dma_api)
+   return 0;
+
+   for (n = 0; n < out_sgs; n++) {
+   for (sg = sgs[n]; sg; sg = sg_next(sg)) {
+   dma_addr_t addr = vring_map_one_sg(vq, sg, DMA_TO_DEVICE);
+
+   if (vring_mapping_error(vq, addr))
+   goto err;
+
+   sg->dma_address = addr;
+   sg_assign_page(sg, NULL);
+   }
+   }
+
+   for (; n < (out_sgs + in_sgs); n++) {
+   for (sg = sgs[n]; sg; sg = sg_next(sg)) {
+   dma_addr_t addr = vring_map_one_sg(vq, sg, DMA_FROM_DEVICE);
+
+   if (vring_mapping_error(vq, addr))
+   goto err;
+
+   sg->dma_address = addr;
+   sg_assign_page(sg, NULL);
+   }
+   }
+
+   return 0;
+
+err:
+   virtqueue_unmap_sgs(vq, sgs, total_sg, out_sgs, in_sgs);
+   return -ENOMEM;
+}
+
 static inline int virtqueue_add_split(struct virtqueue *_vq,
  struct scatterlist *sgs[],
  unsigned int total_sg,
@@ -532,9 +623,9 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
struct vring_virtqueue *vq = to_vvq(_vq);
struct scatterlist *sg;
struct vring_desc *desc;
-   unsigned int i, n, avail, descs_used, prev, err_idx;
-   int head;
+   unsigned int i, n, avail, descs_used, prev;
bool indirect;
+   int head;
 
START_USE(vq);
 
@@ -586,32 +677,30 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
return -ENOSPC;