Re: [PATCH net v4] virtio_net: Fix error unwinding of XDP initialization

2023-05-08 Thread Michael S. Tsirkin
On Mon, May 08, 2023 at 06:27:08PM -0400, Feng Liu wrote:
> When initializing XDP in virtnet_open(), some rq xdp initialization
> may hit an error, causing the net device open to fail. However, previous
> rqs have already initialized XDP and enabled NAPI, which is not the
> expected behavior. We need to roll back the previous rq initialization
> to avoid leaks in error unwinding of the init code.
> 
> Also extract helper functions to disable and enable queue pairs. Use
> the newly introduced disable helper in error unwinding and
> virtnet_close, and the enable helper in virtnet_open.
> 
> Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info")
> Signed-off-by: Feng Liu 
> Reviewed-by: Jiri Pirko 
> ---
> 
> v3 -> v4
> feedbacks from Jiri Pirko
> - Add symmetric helper function virtnet_enable_qp to enable queues.
> - Error handle:  cleanup current queue pair in virtnet_enable_qp,
>   and complete the reset queue pairs cleanup in virtnet_open.
> - Fix coding style.
> feedbacks from Parav Pandit
> - Remove redundant debug message and white space.
> 
> v2 -> v3
> feedbacks from Michael S. Tsirkin
> - Remove redundant comment.
> 
> v1 -> v2
> feedbacks from Michael S. Tsirkin
> - squash two patches together.
> 
> ---
>  drivers/net/virtio_net.c | 58 
>  1 file changed, 41 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 8d8038538fc4..df7c08048fa7 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1868,6 +1868,38 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
>   return received;
>  }
>  
> +static void virtnet_disable_qp(struct virtnet_info *vi, int qp_index)


I am guessing _qp stands for queue pair? Let's call it
virtnet_disable_queue_pair please, consistently with max_queue_pairs.

> +{
> + virtnet_napi_tx_disable(&vi->sq[qp_index].napi);
> + napi_disable(&vi->rq[qp_index].napi);
> + xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq);
> +}
> +
> +static int virtnet_enable_qp(struct virtnet_info *vi, int qp_index)

Similarly, virtnet_enable_queue_pair

> +{
> + struct net_device *dev = vi->dev;
> + int err;
> +
> + err = xdp_rxq_info_reg(&vi->rq[qp_index].xdp_rxq, dev, qp_index,
> +vi->rq[qp_index].napi.napi_id);
> + if (err < 0)
> + return err;
> +
> + err = xdp_rxq_info_reg_mem_model(&vi->rq[qp_index].xdp_rxq,
> +  MEM_TYPE_PAGE_SHARED, NULL);
> + if (err < 0)
> + goto err_xdp_reg_mem_model;
> +
> + virtnet_napi_enable(vi->rq[qp_index].vq, &vi->rq[qp_index].napi);
> + virtnet_napi_tx_enable(vi, vi->sq[qp_index].vq, &vi->sq[qp_index].napi);
> +
> + return 0;
> +
> +err_xdp_reg_mem_model:
> + xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq);
> + return err;
> +}
> +
>  static int virtnet_open(struct net_device *dev)
>  {
>   struct virtnet_info *vi = netdev_priv(dev);
> @@ -1881,22 +1913,17 @@ static int virtnet_open(struct net_device *dev)
>   if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
>   schedule_delayed_work(&vi->refill, 0);
>  
> - err = xdp_rxq_info_reg(&vi->rq[i].xdp_rxq, dev, i, 
> vi->rq[i].napi.napi_id);
> + err = virtnet_enable_qp(vi, i);
>   if (err < 0)
> - return err;
> -
> - err = xdp_rxq_info_reg_mem_model(&vi->rq[i].xdp_rxq,
> -  MEM_TYPE_PAGE_SHARED, NULL);
> - if (err < 0) {
> - xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
> - return err;
> - }
> -
> - virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
> - virtnet_napi_tx_enable(vi, vi->sq[i].vq, &vi->sq[i].napi);
> + goto err_enable_qp;
>   }
>  
>   return 0;
> +
> +err_enable_qp:
> + for (i--; i >= 0; i--)
> + virtnet_disable_qp(vi, i);
> + return err;
>  }
>  
>  static int virtnet_poll_tx(struct napi_struct *napi, int budget)
> @@ -2305,11 +2332,8 @@ static int virtnet_close(struct net_device *dev)
>   /* Make sure refill_work doesn't re-enable napi! */
>   cancel_delayed_work_sync(&vi->refill);
>  
> - for (i = 0; i < vi->max_queue_pairs; i++) {
> - virtnet_napi_tx_disable(&vi->sq[i].napi);
> - napi_disable(&vi->rq[i].napi);
> - xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
> - }
> + for (i = 0; i < vi->max_queue_pairs; i++)
> + virtnet_disable_qp(vi, i);
>  
>   return 0;
>  }
> -- 
> 2.37.1 (Apple Git-137.1)
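The enable/unwind shape used in virtnet_open() above can be sketched in isolation. Below is a minimal userspace C analogue with hypothetical names: enable_one()/disable_one() stand in for virtnet_enable_qp()/virtnet_disable_qp(), and the `enabled` array models NAPI state. It only demonstrates the `for (i--; i >= 0; i--)` rollback, not the driver's real setup:

```c
#include <assert.h>

/* Minimal userspace sketch of the unwind pattern in virtnet_open() above.
 * enable_one()/disable_one() are hypothetical stand-ins for
 * virtnet_enable_qp()/virtnet_disable_qp(); "enabled" models NAPI state. */
static int enabled[8];

static int enable_one(int i, int fail_at)
{
	if (i == fail_at)
		return -1;	/* simulate e.g. xdp_rxq_info_reg() failing */
	enabled[i] = 1;
	return 0;
}

static void disable_one(int i)
{
	enabled[i] = 0;
}

static int open_all(int n, int fail_at)
{
	int i, err;

	for (i = 0; i < n; i++) {
		err = enable_one(i, fail_at);
		if (err < 0)
			goto err_enable;
	}
	return 0;

err_enable:
	/* index i itself was never enabled, so start the rollback at i - 1 */
	for (i--; i >= 0; i--)
		disable_one(i);
	return err;
}
```

The key point, matching the discussion in the thread, is that the failing index i was never enabled, so the rollback must start at i - 1 rather than i.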

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v4] virtio_net: suppress cpu stall when free_unused_bufs

2023-05-08 Thread Jason Wang
On Mon, May 8, 2023 at 2:47 PM Michael S. Tsirkin  wrote:
>
> On Mon, May 08, 2023 at 02:13:42PM +0800, Jason Wang wrote:
> > On Mon, May 8, 2023 at 2:08 PM Michael S. Tsirkin  wrote:
> > >
> > > On Mon, May 08, 2023 at 11:12:03AM +0800, Jason Wang wrote:
> > > >
> > > > 在 2023/5/7 21:34, Michael S. Tsirkin 写道:
> > > > > On Fri, May 05, 2023 at 11:28:25AM +0800, Jason Wang wrote:
> > > > > > On Thu, May 4, 2023 at 10:27 AM Wenliang Wang
> > > > > >  wrote:
> > > > > > > For multi-queue and large ring-size use case, the following error
> > > > > > > occurred when free_unused_bufs:
> > > > > > > rcu: INFO: rcu_sched self-detected stall on CPU.
> > > > > > >
> > > > > > > Fixes: 986a4f4d452d ("virtio_net: multiqueue support")
> > > > > > > Signed-off-by: Wenliang Wang 
> > > > > > > ---
> > > > > > > v2:
> > > > > > > -add need_resched check.
> > > > > > > -apply same logic to sq.
> > > > > > > v3:
> > > > > > > -use cond_resched instead.
> > > > > > > v4:
> > > > > > > -add fixes tag
> > > > > > > ---
> > > > > > >   drivers/net/virtio_net.c | 2 ++
> > > > > > >   1 file changed, 2 insertions(+)
> > > > > > >
> > > > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > > > index 8d8038538fc4..a12ae26db0e2 100644
> > > > > > > --- a/drivers/net/virtio_net.c
> > > > > > > +++ b/drivers/net/virtio_net.c
> > > > > > > @@ -3560,12 +3560,14 @@ static void free_unused_bufs(struct virtnet_info *vi)
> > > > > > >  struct virtqueue *vq = vi->sq[i].vq;
> > > > > > >  while ((buf = virtqueue_detach_unused_buf(vq)) 
> > > > > > > != NULL)
> > > > > > >  virtnet_sq_free_unused_buf(vq, buf);
> > > > > > > +   cond_resched();
> > > > > > Does this really address the case when the virtqueue is very large?
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > it does in that a very large queue is still just 64k in size.
> > > > > we might however have 64k of these queues.
> > > >
> > > >
> > > > Ok, but we have other similar loops especially the refill, I think we 
> > > > may
> > > > need cond_resched() there as well.
> > > >
> > > > Thanks
> > > >
> > >
> > > Refill is already per vq isn't it?
> >
> > Not for the refill_work().
> >
> > Thanks
>
> Good point, refill_work probably needs cond_resched, too.
> And I guess virtnet_open?

Yes, let me draft a patch.

Thanks

>
>
> > >
> > >
> > > > >
> > > > > > >  }
> > > > > > >
> > > > > > >  for (i = 0; i < vi->max_queue_pairs; i++) {
> > > > > > >  struct virtqueue *vq = vi->rq[i].vq;
> > > > > > >  while ((buf = virtqueue_detach_unused_buf(vq)) 
> > > > > > > != NULL)
> > > > > > >  virtnet_rq_free_unused_buf(vq, buf);
> > > > > > > +   cond_resched();
> > > > > > >  }
> > > > > > >   }
> > > > > > >
> > > > > > > --
> > > > > > > 2.20.1
> > > > > > >
> > >
>
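As a side note, the yield-per-queue pattern discussed in this thread can be illustrated with a small userspace C analogue, where sched_yield() stands in for the kernel's cond_resched() and fake_vq is a hypothetical stand-in for a virtqueue:

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Userspace analogue of the loop discussed above: when draining many
 * queues, yield once per queue so one thread cannot hog the CPU long
 * enough to trip stall detectors.  fake_vq is a hypothetical stand-in
 * for a virtqueue; sched_yield() stands in for cond_resched(). */
struct fake_vq { int nbufs; };

static int drain_all(struct fake_vq *vqs, size_t n)
{
	size_t i;
	int freed = 0;

	for (i = 0; i < n; i++) {
		while (vqs[i].nbufs > 0) {	/* "detach" one unused buffer */
			vqs[i].nbufs--;
			freed++;
		}
		sched_yield();			/* stand-in for cond_resched() */
	}
	return freed;
}
```

This mirrors the patch's placement of cond_resched(): once per virtqueue, after the inner detach loop, which bounds the uninterrupted work to one queue's worth of buffers.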


Re: [PATCH] virtio_net: set default mtu to 1500 when 'Device maximum MTU' bigger than 1500

2023-05-08 Thread Xuan Zhuo
On Mon, 8 May 2023 14:10:07 -0400, "Michael S. Tsirkin"  wrote:
> On Mon, May 08, 2023 at 09:25:48AM -0700, Stephen Hemminger wrote:
> > On Mon, 8 May 2023 06:30:07 -0400
> > "Michael S. Tsirkin"  wrote:
> >
> > > > > > I don't know of any scenario where the hardware supports a large
> > > > > > mtu but we do not want the user to use it by default.
> > > > >
> > > > > When other devices on the same LAN have mtu set to 1500 and
> > > > > won't accept bigger packets.
> > > >
> > > > So, that depends on pmtu/tcp-probe-mtu.
> > > >
> > > > If the OS without pmtu/tcp-probe-mtu has a bigger mtu, then its big
> > > > packets will be lost.
> > > >
> > > > Thanks.
> > > >
> > >
> > > pmtu is designed for routing. LAN is supposed to be configured with
> > > a consistent MTU.
> >
> > Virtio is often used with bridging or macvlan which can't support PMTU.
> > PMTU only works when forwarding at layer 3 (ie routing) where there is
> > an IP address to send the ICMP response. If doing L2 forwarding, the
> > only thing the bridge can do is drop the packet.
> >
> > TCP can recover, but detecting an MTU blackhole requires retransmissions.
>
> Exactly. That's why we basically use the MTU advice supplied by device
> by default - it's designed for use-cases of software devices where
> the device might have more information about the MTU than the guest.
> If hardware devices want e.g. a way to communicate support for
> jumbo frames without communicating any information about the LAN,
> a new feature will be needed.


Let's think about this question carefully. If necessary, we will try to
introduce a new feature to the virtio-net spec to support jumbo frames.

Thanks.


>
> --
> MST
>


Re: [PATCH net v3] virtio_net: Fix error unwinding of XDP initialization

2023-05-08 Thread Xuan Zhuo
On Mon, 8 May 2023 11:00:10 -0400, Feng Liu  wrote:
>
>
> On 2023-05-07 p.m.9:45, Xuan Zhuo wrote:
> >
> >
> > On Sat, 6 May 2023 08:08:02 -0400, Feng Liu  wrote:
> >>
> >>
> >> On 2023-05-05 p.m.10:33, Xuan Zhuo wrote:
> >>>
> >>>
> >>> On Tue, 2 May 2023 20:35:25 -0400, Feng Liu  wrote:
>  When initializing XDP in virtnet_open(), some rq xdp initialization
>  may hit an error causing net device open failed. However, previous
>  rqs have already initialized XDP and enabled NAPI, which is not the
>  expected behavior. Need to roll back the previous rq initialization
>  to avoid leaks in error unwinding of init code.
> 
>  Also extract a helper function of disable queue pairs, and use newly
>  introduced helper function in error unwinding and virtnet_close;
> 
>  Issue: 3383038
>  Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info")
>  Signed-off-by: Feng Liu 
>  Reviewed-by: William Tu 
>  Reviewed-by: Parav Pandit 
>  Reviewed-by: Simon Horman 
>  Acked-by: Michael S. Tsirkin 
>  Change-Id: Ib4c6a97cb7b837cfa484c593dd43a435c47ea68f
>  ---
> drivers/net/virtio_net.c | 30 --
> 1 file changed, 20 insertions(+), 10 deletions(-)
> 
>  diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>  index 8d8038538fc4..3737cf120cb7 100644
>  --- a/drivers/net/virtio_net.c
>  +++ b/drivers/net/virtio_net.c
>  @@ -1868,6 +1868,13 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
>  return received;
> }
> 
>  +static void virtnet_disable_qp(struct virtnet_info *vi, int qp_index)
>  +{
>  + virtnet_napi_tx_disable(&vi->sq[qp_index].napi);
>  + napi_disable(&vi->rq[qp_index].napi);
>  + xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq);
>  +}
>  +
> static int virtnet_open(struct net_device *dev)
> {
>  struct virtnet_info *vi = netdev_priv(dev);
>  @@ -1883,20 +1890,26 @@ static int virtnet_open(struct net_device *dev)
> 
>  err = xdp_rxq_info_reg(&vi->rq[i].xdp_rxq, dev, i, 
>  vi->rq[i].napi.napi_id);
>  if (err < 0)
>  - return err;
>  + goto err_xdp_info_reg;
> 
>  err = xdp_rxq_info_reg_mem_model(&vi->rq[i].xdp_rxq,
>   MEM_TYPE_PAGE_SHARED, 
>  NULL);
>  - if (err < 0) {
>  - xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
>  - return err;
>  - }
>  + if (err < 0)
>  + goto err_xdp_reg_mem_model;
> 
>  virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
>  virtnet_napi_tx_enable(vi, vi->sq[i].vq, 
>  &vi->sq[i].napi);
>  }
> 
>  return 0;
>  +
>  +err_xdp_reg_mem_model:
>  + xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
>  +err_xdp_info_reg:
>  + for (i = i - 1; i >= 0; i--)
>  + virtnet_disable_qp(vi, i);
> >>>
> >>>
> >>> I would like to know whether we should handle these:
> >>>
> >>>   disable_delayed_refill(vi);
> >>>   cancel_delayed_work_sync(&vi->refill);
> >>>
> >>>
> >>> Maybe we should call virtnet_close() with "i" directly.
> >>>
> >>> Thanks.
> >>>
> >>>
> >> We can't use i directly here: if xdp_rxq_info_reg fails, NAPI has not
> >> been enabled for the current qp yet, so we must roll back starting from
> >> the queue pairs where NAPI was enabled before (i--), otherwise it will
> >> hang in the NAPI disable API.
> >
> > This is not the point; the key is whether we should handle:
> >
> >disable_delayed_refill(vi);
> >cancel_delayed_work_sync(&vi->refill);
> >
> > Thanks.
> >
> >
>
> OK, I get the point. Thanks for your careful review. I checked the code
> again.
> 
> There are two points that I need to explain:
> 
> 1. All refill delayed-work calls (vi->refill, vi->refill_enabled) assume
> the virtio interface has been opened successfully, e.g.
> virtnet_receive, virtnet_rx_resize, _virtnet_set_queues, etc. If the
> xdp reg fails here, these subsequent functions are never triggered, so
> there is no need to call disable_delayed_refill() and
> cancel_delayed_work_sync().

Maybe something is wrong. I think these lines may schedule the delayed work.

static int virtnet_open(struct net_device *dev)
{
struct virtnet_info *vi = netdev_priv(dev);
int i, err;

enable_delayed_refill(vi);

for (i = 0; i < vi->max_queue_pairs; i++) {
if (i < vi->curr_queue_pairs)
/* Make sure we have some buffers: if oom use wq. */
--> if 

[PATCH net v4] virtio_net: Fix error unwinding of XDP initialization

2023-05-08 Thread Feng Liu via Virtualization
When initializing XDP in virtnet_open(), some rq xdp initialization
may hit an error, causing the net device open to fail. However, previous
rqs have already initialized XDP and enabled NAPI, which is not the
expected behavior. We need to roll back the previous rq initialization
to avoid leaks in error unwinding of the init code.

Also extract helper functions to disable and enable queue pairs. Use
the newly introduced disable helper in error unwinding and
virtnet_close, and the enable helper in virtnet_open.

Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info")
Signed-off-by: Feng Liu 
Reviewed-by: Jiri Pirko 
---

v3 -> v4
feedbacks from Jiri Pirko
- Add symmetric helper function virtnet_enable_qp to enable queues.
- Error handle:  cleanup current queue pair in virtnet_enable_qp,
  and complete the reset queue pairs cleanup in virtnet_open.
- Fix coding style.
feedbacks from Parav Pandit
- Remove redundant debug message and white space.

v2 -> v3
feedbacks from Michael S. Tsirkin
- Remove redundant comment.

v1 -> v2
feedbacks from Michael S. Tsirkin
- squash two patches together.

---
 drivers/net/virtio_net.c | 58 
 1 file changed, 41 insertions(+), 17 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 8d8038538fc4..df7c08048fa7 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1868,6 +1868,38 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
return received;
 }
 
+static void virtnet_disable_qp(struct virtnet_info *vi, int qp_index)
+{
+   virtnet_napi_tx_disable(&vi->sq[qp_index].napi);
+   napi_disable(&vi->rq[qp_index].napi);
+   xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq);
+}
+
+static int virtnet_enable_qp(struct virtnet_info *vi, int qp_index)
+{
+   struct net_device *dev = vi->dev;
+   int err;
+
+   err = xdp_rxq_info_reg(&vi->rq[qp_index].xdp_rxq, dev, qp_index,
+  vi->rq[qp_index].napi.napi_id);
+   if (err < 0)
+   return err;
+
+   err = xdp_rxq_info_reg_mem_model(&vi->rq[qp_index].xdp_rxq,
+MEM_TYPE_PAGE_SHARED, NULL);
+   if (err < 0)
+   goto err_xdp_reg_mem_model;
+
+   virtnet_napi_enable(vi->rq[qp_index].vq, &vi->rq[qp_index].napi);
+   virtnet_napi_tx_enable(vi, vi->sq[qp_index].vq, &vi->sq[qp_index].napi);
+
+   return 0;
+
+err_xdp_reg_mem_model:
+   xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq);
+   return err;
+}
+
 static int virtnet_open(struct net_device *dev)
 {
struct virtnet_info *vi = netdev_priv(dev);
@@ -1881,22 +1913,17 @@ static int virtnet_open(struct net_device *dev)
if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
schedule_delayed_work(&vi->refill, 0);
 
-   err = xdp_rxq_info_reg(&vi->rq[i].xdp_rxq, dev, i, 
vi->rq[i].napi.napi_id);
+   err = virtnet_enable_qp(vi, i);
if (err < 0)
-   return err;
-
-   err = xdp_rxq_info_reg_mem_model(&vi->rq[i].xdp_rxq,
-MEM_TYPE_PAGE_SHARED, NULL);
-   if (err < 0) {
-   xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
-   return err;
-   }
-
-   virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
-   virtnet_napi_tx_enable(vi, vi->sq[i].vq, &vi->sq[i].napi);
+   goto err_enable_qp;
}
 
return 0;
+
+err_enable_qp:
+   for (i--; i >= 0; i--)
+   virtnet_disable_qp(vi, i);
+   return err;
 }
 
 static int virtnet_poll_tx(struct napi_struct *napi, int budget)
@@ -2305,11 +2332,8 @@ static int virtnet_close(struct net_device *dev)
/* Make sure refill_work doesn't re-enable napi! */
cancel_delayed_work_sync(&vi->refill);
 
-   for (i = 0; i < vi->max_queue_pairs; i++) {
-   virtnet_napi_tx_disable(&vi->sq[i].napi);
-   napi_disable(&vi->rq[i].napi);
-   xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
-   }
+   for (i = 0; i < vi->max_queue_pairs; i++)
+   virtnet_disable_qp(vi, i);
 
return 0;
 }
-- 
2.37.1 (Apple Git-137.1)



Re: [PATCH] virtio_net: set default mtu to 1500 when 'Device maximum MTU' bigger than 1500

2023-05-08 Thread Michael S. Tsirkin
On Mon, May 08, 2023 at 09:25:48AM -0700, Stephen Hemminger wrote:
> On Mon, 8 May 2023 06:30:07 -0400
> "Michael S. Tsirkin"  wrote:
> 
> > > > > I don't know of any scenario where the hardware supports a large
> > > > > mtu but we do not want the user to use it by default.
> > > >
> > > > When other devices on the same LAN have mtu set to 1500 and
> > > > won't accept bigger packets.  
> > > 
> > > So, that depends on pmtu/tcp-probe-mtu.
> > > 
> > > If the OS without pmtu/tcp-probe-mtu has a bigger mtu, then its big
> > > packets will be lost.
> > > 
> > > Thanks.
> > >   
> > 
> > pmtu is designed for routing. LAN is supposed to be configured with
> > a consistent MTU.
> 
> Virtio is often used with bridging or macvlan which can't support PMTU.
> PMTU only works when forwarding at layer 3 (ie routing) where there is
> an IP address to send the ICMP response. If doing L2 forwarding, the
> only thing the bridge can do is drop the packet.
> 
> TCP can recover, but detecting an MTU blackhole requires retransmissions.

Exactly. That's why we basically use the MTU advice supplied by device
by default - it's designed for use-cases of software devices where
the device might have more information about the MTU than the guest.
If hardware devices want e.g. a way to communicate support for
jumbo frames without communicating any information about the LAN,
a new feature will be needed.

-- 
MST



Re: [PATCH net v3] virtio_net: Fix error unwinding of XDP initialization

2023-05-08 Thread Feng Liu via Virtualization



On 2023-05-07 p.m.9:45, Xuan Zhuo wrote:



On Sat, 6 May 2023 08:08:02 -0400, Feng Liu  wrote:



On 2023-05-05 p.m.10:33, Xuan Zhuo wrote:



On Tue, 2 May 2023 20:35:25 -0400, Feng Liu  wrote:

When initializing XDP in virtnet_open(), some rq xdp initialization
may hit an error, causing the net device open to fail. However, previous
rqs have already initialized XDP and enabled NAPI, which is not the
expected behavior. We need to roll back the previous rq initialization
to avoid leaks in error unwinding of the init code.

Also extract a helper function to disable queue pairs, and use the
newly introduced helper function in error unwinding and virtnet_close.

Issue: 3383038
Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info")
Signed-off-by: Feng Liu 
Reviewed-by: William Tu 
Reviewed-by: Parav Pandit 
Reviewed-by: Simon Horman 
Acked-by: Michael S. Tsirkin 
Change-Id: Ib4c6a97cb7b837cfa484c593dd43a435c47ea68f
---
   drivers/net/virtio_net.c | 30 --
   1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 8d8038538fc4..3737cf120cb7 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1868,6 +1868,13 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
return received;
   }

+static void virtnet_disable_qp(struct virtnet_info *vi, int qp_index)
+{
+ virtnet_napi_tx_disable(&vi->sq[qp_index].napi);
+ napi_disable(&vi->rq[qp_index].napi);
+ xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq);
+}
+
   static int virtnet_open(struct net_device *dev)
   {
struct virtnet_info *vi = netdev_priv(dev);
@@ -1883,20 +1890,26 @@ static int virtnet_open(struct net_device *dev)

err = xdp_rxq_info_reg(&vi->rq[i].xdp_rxq, dev, i, 
vi->rq[i].napi.napi_id);
if (err < 0)
- return err;
+ goto err_xdp_info_reg;

err = xdp_rxq_info_reg_mem_model(&vi->rq[i].xdp_rxq,
 MEM_TYPE_PAGE_SHARED, NULL);
- if (err < 0) {
- xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
- return err;
- }
+ if (err < 0)
+ goto err_xdp_reg_mem_model;

virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
virtnet_napi_tx_enable(vi, vi->sq[i].vq, &vi->sq[i].napi);
}

return 0;
+
+err_xdp_reg_mem_model:
+ xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
+err_xdp_info_reg:
+ for (i = i - 1; i >= 0; i--)
+ virtnet_disable_qp(vi, i);



I would like to know whether we should handle these:

  disable_delayed_refill(vi);
  cancel_delayed_work_sync(&vi->refill);


Maybe we should call virtnet_close() with "i" directly.

Thanks.



We can't use i directly here: if xdp_rxq_info_reg fails, NAPI has not
been enabled for the current qp yet, so we must roll back starting from
the queue pairs where NAPI was enabled before (i--), otherwise it will
hang in the NAPI disable API.


This is not the point; the key is whether we should handle:

   disable_delayed_refill(vi);
   cancel_delayed_work_sync(&vi->refill);

Thanks.




OK, I get the point. Thanks for your careful review. I checked the code
again.

There are two points that I need to explain:

1. All refill delayed-work calls (vi->refill, vi->refill_enabled) assume
the virtio interface has been opened successfully, e.g. virtnet_receive,
virtnet_rx_resize, _virtnet_set_queues, etc. If the xdp reg fails here,
these subsequent functions are never triggered, so there is no need to
call disable_delayed_refill() and cancel_delayed_work_sync(). The logic
here differs from that of virtnet_close: virtnet_close assumes
virtnet_open succeeded and that tx and rx have been running normally.
For error unwinding, only disabling the qp is needed. I also
encapsulated a helper function to disable a qp, which is used in both
error unwinding and virtnet_close.
2. The current (failing) qp has not enabled NAPI yet, so it can only
call xdp unreg and must not call the NAPI disable interface, otherwise
the kernel will get stuck. That is why i-- is used: disable qp is
called only on the previous queue pairs.


Thanks




+
+ return err;
   }

   static int virtnet_poll_tx(struct napi_struct *napi, int budget)
@@ -2305,11 +2318,8 @@ static int virtnet_close(struct net_device *dev)
/* Make sure refill_work doesn't re-enable napi! */
cancel_delayed_work_sync(&vi->refill);

- for (i = 0; i < vi->max_queue_pairs; i++) {
- virtnet_napi_tx_disable(&vi->sq[i].napi);
- napi_disable(&vi->rq[i].napi);
- xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
- }
+ for (i = 0; i < vi->max_queue_pairs; i++)
+ virtnet_disable_qp(vi, i);


Re: [PATCH v6 0/3] Add sync object UAPI support to VirtIO-GPU driver

2023-05-08 Thread Rob Clark
On Wed, May 3, 2023 at 10:07 AM Gurchetan Singh
 wrote:
>
>
>
> On Mon, May 1, 2023 at 8:38 AM Dmitry Osipenko 
>  wrote:
>>
>> On 4/16/23 14:52, Dmitry Osipenko wrote:
>> > We have multiple Vulkan context types that are awaiting for the addition
>> > of the sync object DRM UAPI support to the VirtIO-GPU kernel driver:
>> >
>> >  1. Venus context
>> >  2. Native contexts (virtio-freedreno, virtio-intel, virtio-amdgpu)
>> >
>> > Mesa core supports DRM sync object UAPI, providing Vulkan drivers with a
>> > generic fencing implementation that we want to utilize.
>> >
>> > This patch adds initial sync objects support. It creates fundament for a
>> > further fencing improvements. Later on we will want to extend the 
>> > VirtIO-GPU
>> > fencing API with passing fence IDs to host for waiting, it will be a new
>> > additional VirtIO-GPU IOCTL and more. Today we have several VirtIO-GPU 
>> > context
>> > drivers in works that require VirtIO-GPU to support sync objects UAPI.
>> >
>> > The patch is heavily inspired by the sync object UAPI implementation of the
>> > MSM driver.
>>
>> Gerd, do you have any objections to merging this series?
>>
>> We have AMDGPU [1] and Intel [2] native context WIP drivers depending on
>> the sync object support. It is the only part missing from kernel today
>> that is wanted by the native context drivers. Otherwise, there are few
>> other things in Qemu and virglrenderer left to sort out.
>>
>> [1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21658
>> [2] https://gitlab.freedesktop.org/digetx/mesa/-/commits/native-context-iris
>
>
> I'm not saying this change isn't good, just that it's probably possible to
> implement the native contexts (even up to VK 1.2) without it.  But this
> patch series may be the most ergonomic way to do it, given how Mesa is
> designed.  You probably want one of the Mesa MRs reviewed first before
> merging (I added a comment on the amdgpu change), and that is a requirement
> [a].
>
> [a] "The userspace side must be fully reviewed and tested to the standards of 
> that user space project. For e.g. mesa this means piglit testcases and review 
> on the mailing list. This is again to ensure that the new interface actually 
> gets the job done." -- from the requirements
>

tbh, the syncobj support is all drm core; the only driver-specific part is
the ioctl parsing.  IMHO the existing tests and the two existing consumers
are sufficient.  (Also, considering the additional non-drm
dependencies involved.)

If this was for the core drm syncobj implementation, and not just
driver ioctl parsing and wiring up the core helpers, I would agree
with you.

BR,
-R

Re: [PATCH] virtio_net: set default mtu to 1500 when 'Device maximum MTU' bigger than 1500

2023-05-08 Thread Michael S. Tsirkin
On Mon, May 08, 2023 at 03:41:56PM +0800, Xuan Zhuo wrote:
> On Mon, 8 May 2023 02:43:24 -0400, "Michael S. Tsirkin"  
> wrote:
> > On Mon, May 08, 2023 at 02:18:08PM +0800, Xuan Zhuo wrote:
> > > On Mon, 8 May 2023 02:15:46 -0400, "Michael S. Tsirkin"  
> > > wrote:
> > > > On Mon, May 08, 2023 at 10:01:59AM +0800, Xuan Zhuo wrote:
> > > > > On Sun, 7 May 2023 04:58:58 -0400, "Michael S. Tsirkin" 
> > > > >  wrote:
> > > > > > On Sat, May 06, 2023 at 04:56:35PM +0800, Hao Chen wrote:
> > > > > > >
> > > > > > >
> > > > > > > 在 2023/5/6 10:50, Xuan Zhuo 写道:
> > > > > > > > On Sat,  6 May 2023 10:15:29 +0800, Hao Chen  
> > > > > > > > wrote:
> > > > > > > > > When VIRTIO_NET_F_MTU(3) Device maximum MTU reporting is 
> > > > > > > > > supported.
> > > > > > > > > If offered by the device, device advises driver about the 
> > > > > > > > > value of its
> > > > > > > > > maximum MTU. If negotiated, the driver uses mtu as the maximum
> > > > > > > > > MTU value. But there the driver also uses it as default mtu,
> > > > > > > > > some devices may have a maximum MTU greater than 1500, this 
> > > > > > > > > may
> > > > > > > > > cause some large packages to be discarded,
> > > > > > > >
> > > > > > > > You mean tx packet?
> > > > > > > Yes.
> > > > > > > >
> > > > > > > > If yes, I do not think this is the problem of driver.
> > > > > > > >
> > > > > > > > Maybe you should give more details about the discard.
> > > > > > > >
> > > > > > > In the current code, if the maximum MTU supported by the virtio 
> > > > > > > net hardware
> > > > > > > is 9000, the default MTU of the virtio net driver will also be 
> > > > > > > set to 9000.
> > > > > > > When sending packets through "ping -s 5000", if the peer router 
> > > > > > > does not
> > > > > > > support negotiating a path MTU through ICMP packets, the packets 
> > > > > > > will be
> > > > > > > discarded. If the peer router supports negotiating path mtu 
> > > > > > > through ICMP
> > > > > > > packets, the host side will perform packet fragmentation
> > > > > > > based on the
> > > > > > > negotiated path mtu, which is generally within 1500.
> > > > > > > This is not a bugfix patch, I think setting the default mtu to
> > > > > > > within 1500
> > > > > > > would be more suitable here. Thanks.
> > > > > >
> > > > > > I don't think VIRTIO_NET_F_MTU is appropriate for support for jumbo 
> > > > > > packets.
> > > > > > The spec says:
> > > > > > The device MUST forward transmitted packets of up to mtu (plus 
> > > > > > low level ethernet header length) size with
> > > > > > gso_type NONE or ECN, and do so without fragmentation, after 
> > > > > > VIRTIO_NET_F_MTU has been success-
> > > > > > fully negotiated.
> > > > > > VIRTIO_NET_F_MTU has been designed for all kind of tunneling 
> > > > > > devices,
> > > > > > and this is why we set mtu to max by default.
> > > > > >
> > > > > > For things like jumbo frames where MTU might or might not be 
> > > > > > available,
> > > > > > a new feature would be more appropriate.
> > > > >
> > > > >
> > > > > So for jumbo frame, what is the problem?
> > > > >
> > > > > We are trying to do this. @Heng
> > > > >
> > > > > Thanks.
> > > >
> > > > It is not a problem as such. But VIRTIO_NET_F_MTU will set the
> > > > default MTU, not just the maximum one, because the spec seems to
> > > > say it can.
> > >
> > > I see.
> > >
> > > In the case of Jumbo Frame, we also hope that the driver will set the 
> > > default
> > > directly to the max mtu. Just like what you said "Bigger packets = better
> > > performance."
> > >
> > > I don't know of any scenario where the hardware supports a large mtu
> > > but we do not want the user to use it by default.
> >
> > When other devices on the same LAN have mtu set to 1500 and
> > won't accept bigger packets.
> 
> So, that depends on pmtu/tcp-probe-mtu.
> 
> If the OS without pmtu/tcp-probe-mtu has a bigger mtu, then its big
> packets will be lost.
> 
> Thanks.
> 

pmtu is designed for routing. LAN is supposed to be configured with
a consistent MTU.

> >
> > > Of course, the scenario that this patch
> > > wants to handle does exist, but I have never thought that this is a 
> > > problem at
> > > the driver level.
> > >
> > > Thanks.
> > >
> > >
> > > >
> > > >
> > > > >
> > > > > >
> > > > > > > > > so I changed the MTU to a more
> > > > > > > > > general 1500 when 'Device maximum MTU' bigger than 1500.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Hao Chen 
> > > > > > > > > ---
> > > > > > > > >   drivers/net/virtio_net.c | 5 -
> > > > > > > > >   1 file changed, 4 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/net/virtio_net.c 
> > > > > > > > > b/drivers/net/virtio_net.c
> > > > > > > > > index 8d8038538fc4..e71c7d1b5f29 100644
> > > > > > > > > --- a/drivers/net/virtio_net.c
> > > > > > > > > +++ b/drivers/net/virtio_net.c
> > > > > > > > > @@ -4040,7 +4040,10 @@ static int virtnet_probe(struct 
> > > > > > > > > virtio_
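For illustration only, the behavior the patch argues for (the device advertises a large maximum MTU, but the default operational MTU is clamped to 1500) could be sketched as below; default_mtu() is a hypothetical helper, not the actual driver code:

```c
#include <assert.h>

#define ETH_DATA_LEN 1500	/* classic Ethernet MTU */

/* Hypothetical helper sketching the proposal under discussion: keep the
 * device maximum as the upper bound, but never default the operational
 * MTU above 1500. */
static int default_mtu(int device_max_mtu)
{
	return device_max_mtu > ETH_DATA_LEN ? ETH_DATA_LEN : device_max_mtu;
}
```

Whether this clamp is desirable is exactly what the thread disputes: MST argues the VIRTIO_NET_F_MTU advice is designed to be used as the default, while the patch author wants the 1500 default for jumbo-frame hardware.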

[PATCH V2 3/5] vDPA/ifcvf: retire ifcvf_start_datapath and ifcvf_add_status

2023-05-08 Thread Zhu Lingshan
Rather than the former lazy-initialization mechanism, the
virtqueue operations and driver_features-related ops now
access the virtio registers directly to take immediate
actions, so ifcvf_start_datapath() can be retired.

ifcvf_add_status() is retired because a vendor driver should not
change the device status on its own; this driver should only set
the device status that comes from the virtio driver through
vdpa_ops.set_status().

Signed-off-by: Zhu Lingshan 
Acked-by: Jason Wang 
---
 drivers/vdpa/ifcvf/ifcvf_base.c | 19 ---
 drivers/vdpa/ifcvf/ifcvf_base.h |  1 -
 drivers/vdpa/ifcvf/ifcvf_main.c | 23 ---
 3 files changed, 43 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
index 546e923bcd16..79e313c5e10e 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.c
+++ b/drivers/vdpa/ifcvf/ifcvf_base.c
@@ -178,15 +178,6 @@ void ifcvf_reset(struct ifcvf_hw *hw)
ifcvf_get_status(hw);
 }
 
-static void ifcvf_add_status(struct ifcvf_hw *hw, u8 status)
-{
-   if (status != 0)
-   status |= ifcvf_get_status(hw);
-
-   ifcvf_set_status(hw, status);
-   ifcvf_get_status(hw);
-}
-
 u64 ifcvf_get_hw_features(struct ifcvf_hw *hw)
 {
struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
@@ -387,16 +378,6 @@ static void ifcvf_hw_disable(struct ifcvf_hw *hw)
}
 }
 
-int ifcvf_start_hw(struct ifcvf_hw *hw)
-{
-   ifcvf_add_status(hw, VIRTIO_CONFIG_S_ACKNOWLEDGE);
-   ifcvf_add_status(hw, VIRTIO_CONFIG_S_DRIVER);
-
-   ifcvf_add_status(hw, VIRTIO_CONFIG_S_DRIVER_OK);
-
-   return 0;
-}
-
 void ifcvf_stop_hw(struct ifcvf_hw *hw)
 {
ifcvf_hw_disable(hw);
diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
index cb19196c3ece..d34d3bc0dbf4 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.h
+++ b/drivers/vdpa/ifcvf/ifcvf_base.h
@@ -110,7 +110,6 @@ struct ifcvf_vdpa_mgmt_dev {
 };
 
 int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *dev);
-int ifcvf_start_hw(struct ifcvf_hw *hw);
 void ifcvf_stop_hw(struct ifcvf_hw *hw);
 void ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid);
 void ifcvf_read_dev_config(struct ifcvf_hw *hw, u64 offset,
diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index 4588484bd53d..968687159e44 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -346,22 +346,6 @@ static int ifcvf_request_irq(struct ifcvf_hw *vf)
return 0;
 }
 
-static int ifcvf_start_datapath(struct ifcvf_adapter *adapter)
-{
-   struct ifcvf_hw *vf = adapter->vf;
-   u8 status;
-   int ret;
-
-   ret = ifcvf_start_hw(vf);
-   if (ret < 0) {
-   status = ifcvf_get_status(vf);
-   status |= VIRTIO_CONFIG_S_FAILED;
-   ifcvf_set_status(vf, status);
-   }
-
-   return ret;
-}
-
 static int ifcvf_stop_datapath(struct ifcvf_adapter *adapter)
 {
struct ifcvf_hw *vf = adapter->vf;
@@ -452,13 +436,11 @@ static u8 ifcvf_vdpa_get_status(struct vdpa_device *vdpa_dev)
 
 static void ifcvf_vdpa_set_status(struct vdpa_device *vdpa_dev, u8 status)
 {
-   struct ifcvf_adapter *adapter;
struct ifcvf_hw *vf;
u8 status_old;
int ret;
 
vf  = vdpa_to_vf(vdpa_dev);
-   adapter = vdpa_to_adapter(vdpa_dev);
status_old = ifcvf_get_status(vf);
 
if (status_old == status)
@@ -473,11 +455,6 @@ static void ifcvf_vdpa_set_status(struct vdpa_device *vdpa_dev, u8 status)
ifcvf_set_status(vf, status);
return;
}
-
-   if (ifcvf_start_datapath(adapter) < 0)
-   IFCVF_ERR(adapter->pdev,
- "Failed to set ifcvf vdpa  status %u\n",
- status);
}
 
ifcvf_set_status(vf, status);
-- 
2.39.1
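For reference, the read-modify-write semantics of the retired
ifcvf_add_status() can be modeled in a few lines (a userspace sketch; a
plain value stands in for the device status register). This is exactly the
vendor-driven status mutation the patch removes in favor of taking the
status from vdpa_ops.set_status():

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the removed helper: OR the new bit into the current
 * status unless the caller passes 0, which resets the status. */
static uint8_t add_status(uint8_t current, uint8_t status)
{
	if (status != 0)
		status |= current;
	return status;
}
```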

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH V2 4/5] vDPA/ifcvf: synchronize irqs in the reset routine

2023-05-08 Thread Zhu Lingshan
This commit synchronizes the irqs of the virtqueues
and the config space in the reset routine.
Thus ifcvf_stop_hw() and reset() are refactored as well.

Signed-off-by: Zhu Lingshan 
---
 drivers/vdpa/ifcvf/ifcvf_base.c | 41 +
 drivers/vdpa/ifcvf/ifcvf_base.h |  1 +
 drivers/vdpa/ifcvf/ifcvf_main.c | 46 +
 3 files changed, 38 insertions(+), 50 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
index 79e313c5e10e..1f39290baa38 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.c
+++ b/drivers/vdpa/ifcvf/ifcvf_base.c
@@ -170,12 +170,9 @@ void ifcvf_set_status(struct ifcvf_hw *hw, u8 status)
 
 void ifcvf_reset(struct ifcvf_hw *hw)
 {
-   hw->config_cb.callback = NULL;
-   hw->config_cb.private = NULL;
-
ifcvf_set_status(hw, 0);
-   /* flush set_status, make sure VF is stopped, reset */
-   ifcvf_get_status(hw);
+   while (ifcvf_get_status(hw))
+   msleep(1);
 }
 
 u64 ifcvf_get_hw_features(struct ifcvf_hw *hw)
@@ -368,20 +365,42 @@ void ifcvf_set_vq_ready(struct ifcvf_hw *hw, u16 qid, bool ready)
vp_iowrite16(ready, &cfg->queue_enable);
 }
 
-static void ifcvf_hw_disable(struct ifcvf_hw *hw)
+static void ifcvf_reset_vring(struct ifcvf_hw *hw)
 {
-   u32 i;
+   u16 qid;
+
+   for (qid = 0; qid < hw->nr_vring; qid++) {
+   hw->vring[qid].cb.callback = NULL;
+   hw->vring[qid].cb.private = NULL;
+   ifcvf_set_vq_vector(hw, qid, VIRTIO_MSI_NO_VECTOR);
+   }
+}
 
+static void ifcvf_reset_config_handler(struct ifcvf_hw *hw)
+{
+   hw->config_cb.callback = NULL;
+   hw->config_cb.private = NULL;
ifcvf_set_config_vector(hw, VIRTIO_MSI_NO_VECTOR);
-   for (i = 0; i < hw->nr_vring; i++) {
-   ifcvf_set_vq_vector(hw, i, VIRTIO_MSI_NO_VECTOR);
+}
+
+static void ifcvf_synchronize_irq(struct ifcvf_hw *hw)
+{
+   u32 nvectors = hw->num_msix_vectors;
+   struct pci_dev *pdev = hw->pdev;
+   int i, irq;
+
+   for (i = 0; i < nvectors; i++) {
+   irq = pci_irq_vector(pdev, i);
+   if (irq >= 0)
+   synchronize_irq(irq);
}
 }
 
 void ifcvf_stop_hw(struct ifcvf_hw *hw)
 {
-   ifcvf_hw_disable(hw);
-   ifcvf_reset(hw);
+   ifcvf_synchronize_irq(hw);
+   ifcvf_reset_vring(hw);
+   ifcvf_reset_config_handler(hw);
 }
 
 void ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid)
diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
index d34d3bc0dbf4..7430f80779be 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.h
+++ b/drivers/vdpa/ifcvf/ifcvf_base.h
@@ -82,6 +82,7 @@ struct ifcvf_hw {
int vqs_reused_irq;
u16 nr_vring;
/* VIRTIO_PCI_CAP_DEVICE_CFG size */
+   u32 num_msix_vectors;
u32 cap_dev_config_size;
struct pci_dev *pdev;
 };
diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index 968687159e44..3401b9901dd2 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -125,6 +125,7 @@ static void ifcvf_free_irq(struct ifcvf_hw *vf)
ifcvf_free_vq_irq(vf);
ifcvf_free_config_irq(vf);
ifcvf_free_irq_vectors(pdev);
+   vf->num_msix_vectors = 0;
 }
 
 /* ifcvf MSIX vectors allocator, this helper tries to allocate
@@ -343,36 +344,11 @@ static int ifcvf_request_irq(struct ifcvf_hw *vf)
if (ret)
return ret;
 
-   return 0;
-}
-
-static int ifcvf_stop_datapath(struct ifcvf_adapter *adapter)
-{
-   struct ifcvf_hw *vf = adapter->vf;
-   int i;
-
-   for (i = 0; i < vf->nr_vring; i++)
-   vf->vring[i].cb.callback = NULL;
-
-   ifcvf_stop_hw(vf);
+   vf->num_msix_vectors = nvectors;
 
return 0;
 }
 
-static void ifcvf_reset_vring(struct ifcvf_adapter *adapter)
-{
-   struct ifcvf_hw *vf = adapter->vf;
-   int i;
-
-   for (i = 0; i < vf->nr_vring; i++) {
-   vf->vring[i].last_avail_idx = 0;
-   vf->vring[i].cb.callback = NULL;
-   vf->vring[i].cb.private = NULL;
-   }
-
-   ifcvf_reset(vf);
-}
-
 static struct ifcvf_adapter *vdpa_to_adapter(struct vdpa_device *vdpa_dev)
 {
return container_of(vdpa_dev, struct ifcvf_adapter, vdpa);
@@ -462,23 +438,15 @@ static void ifcvf_vdpa_set_status(struct vdpa_device *vdpa_dev, u8 status)
 
 static int ifcvf_vdpa_reset(struct vdpa_device *vdpa_dev)
 {
-   struct ifcvf_adapter *adapter;
-   struct ifcvf_hw *vf;
-   u8 status_old;
-
-   vf  = vdpa_to_vf(vdpa_dev);
-   adapter = vdpa_to_adapter(vdpa_dev);
-   status_old = ifcvf_get_status(vf);
+   struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev);
+   u8 status = ifcvf_get_status(vf);
 
-   if (status_old == 0)
-   return 0;
+   ifcvf_stop_hw(vf);
 
-   if (status_old & VIRTIO_CONFIG_S_DRIVER_OK) {
-   ifcvf_stop_datapath(adapter

[PATCH V2 5/5] vDPA/ifcvf: a vendor driver should not set _CONFIG_S_FAILED

2023-05-08 Thread Zhu Lingshan
VIRTIO_CONFIG_S_FAILED indicates that the guest driver has given up
the device due to fatal errors. It is the guest's decision, so
the vendor driver should not set this status on the device.

Signed-off-by: Zhu Lingshan 
Acked-by: Jason Wang 
---
 drivers/vdpa/ifcvf/ifcvf_main.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index 3401b9901dd2..b413688e13c4 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -426,9 +426,7 @@ static void ifcvf_vdpa_set_status(struct vdpa_device *vdpa_dev, u8 status)
!(status_old & VIRTIO_CONFIG_S_DRIVER_OK)) {
ret = ifcvf_request_irq(vf);
if (ret) {
-   status = ifcvf_get_status(vf);
-   status |= VIRTIO_CONFIG_S_FAILED;
-   ifcvf_set_status(vf, status);
+   IFCVF_ERR(vf->pdev, "failed to request irq with error %d\n", ret);
return;
}
}
-- 
2.39.1



[PATCH V2 1/5] vDPA/ifcvf: virt queue ops take immediate actions

2023-05-08 Thread Zhu Lingshan
In this commit, the virtqueue operations set_vq_num(),
set_vq_address(), set_vq_ready() and get_vq_ready()
access the PCI registers directly to take immediate
actions.

Signed-off-by: Zhu Lingshan 
---
 drivers/vdpa/ifcvf/ifcvf_base.c | 58 -
 drivers/vdpa/ifcvf/ifcvf_base.h | 10 +++---
 drivers/vdpa/ifcvf/ifcvf_main.c | 16 +++--
 3 files changed, 45 insertions(+), 39 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
index 5563b3a773c7..6c5650f73007 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.c
+++ b/drivers/vdpa/ifcvf/ifcvf_base.c
@@ -329,31 +329,49 @@ int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u16 num)
return 0;
 }
 
-static int ifcvf_hw_enable(struct ifcvf_hw *hw)
+void ifcvf_set_vq_num(struct ifcvf_hw *hw, u16 qid, u32 num)
 {
-   struct virtio_pci_common_cfg __iomem *cfg;
-   u32 i;
+   struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
 
-   cfg = hw->common_cfg;
-   for (i = 0; i < hw->nr_vring; i++) {
-   if (!hw->vring[i].ready)
-   break;
+   vp_iowrite16(qid, &cfg->queue_select);
+   vp_iowrite16(num, &cfg->queue_size);
+}
 
-   vp_iowrite16(i, &cfg->queue_select);
-   vp_iowrite64_twopart(hw->vring[i].desc, &cfg->queue_desc_lo,
-&cfg->queue_desc_hi);
-   vp_iowrite64_twopart(hw->vring[i].avail, &cfg->queue_avail_lo,
- &cfg->queue_avail_hi);
-   vp_iowrite64_twopart(hw->vring[i].used, &cfg->queue_used_lo,
-&cfg->queue_used_hi);
-   vp_iowrite16(hw->vring[i].size, &cfg->queue_size);
-   ifcvf_set_vq_state(hw, i, hw->vring[i].last_avail_idx);
-   vp_iowrite16(1, &cfg->queue_enable);
-   }
+int ifcvf_set_vq_address(struct ifcvf_hw *hw, u16 qid, u64 desc_area,
+u64 driver_area, u64 device_area)
+{
+   struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
+
+   vp_iowrite16(qid, &cfg->queue_select);
+   vp_iowrite64_twopart(desc_area, &cfg->queue_desc_lo,
+&cfg->queue_desc_hi);
+   vp_iowrite64_twopart(driver_area, &cfg->queue_avail_lo,
+&cfg->queue_avail_hi);
+   vp_iowrite64_twopart(device_area, &cfg->queue_used_lo,
+&cfg->queue_used_hi);
 
return 0;
 }
 
+bool ifcvf_get_vq_ready(struct ifcvf_hw *hw, u16 qid)
+{
+   struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
+   u16 queue_enable;
+
+   vp_iowrite16(qid, &cfg->queue_select);
+   queue_enable = vp_ioread16(&cfg->queue_enable);
+
+   return (bool)queue_enable;
+}
+
+void ifcvf_set_vq_ready(struct ifcvf_hw *hw, u16 qid, bool ready)
+{
+   struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
+
+   vp_iowrite16(qid, &cfg->queue_select);
+   vp_iowrite16(ready, &cfg->queue_enable);
+}
+
 static void ifcvf_hw_disable(struct ifcvf_hw *hw)
 {
u32 i;
@@ -366,16 +384,12 @@ static void ifcvf_hw_disable(struct ifcvf_hw *hw)
 
 int ifcvf_start_hw(struct ifcvf_hw *hw)
 {
-   ifcvf_reset(hw);
ifcvf_add_status(hw, VIRTIO_CONFIG_S_ACKNOWLEDGE);
ifcvf_add_status(hw, VIRTIO_CONFIG_S_DRIVER);
 
if (ifcvf_config_features(hw) < 0)
return -EINVAL;
 
-   if (ifcvf_hw_enable(hw) < 0)
-   return -EINVAL;
-
ifcvf_add_status(hw, VIRTIO_CONFIG_S_DRIVER_OK);
 
return 0;
diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
index c20d1c40214e..d545a9411143 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.h
+++ b/drivers/vdpa/ifcvf/ifcvf_base.h
@@ -47,12 +47,7 @@
 #define MSIX_VECTOR_DEV_SHARED 3
 
 struct vring_info {
-   u64 desc;
-   u64 avail;
-   u64 used;
-   u16 size;
u16 last_avail_idx;
-   bool ready;
void __iomem *notify_addr;
phys_addr_t notify_pa;
u32 irq;
@@ -137,4 +132,9 @@ int ifcvf_probed_virtio_net(struct ifcvf_hw *hw);
 u32 ifcvf_get_config_size(struct ifcvf_hw *hw);
 u16 ifcvf_set_vq_vector(struct ifcvf_hw *hw, u16 qid, int vector);
 u16 ifcvf_set_config_vector(struct ifcvf_hw *hw, int vector);
+void ifcvf_set_vq_num(struct ifcvf_hw *hw, u16 qid, u32 num);
+int ifcvf_set_vq_address(struct ifcvf_hw *hw, u16 qid, u64 desc_area,
+u64 driver_area, u64 device_area);
+bool ifcvf_get_vq_ready(struct ifcvf_hw *hw, u16 qid);
+void ifcvf_set_vq_ready(struct ifcvf_hw *hw, u16 qid, bool ready);
 #endif /* _IFCVF_H_ */
diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index 7f78c47e40d6..1357c67014ab 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -382,10 +382,6 @@ static void ifcvf_reset_vring(struct ifcvf_adapter *adapter)
 
for (i = 0; i < vf->nr_vring; i+

[PATCH V2 2/5] vDPA/ifcvf: get_driver_features from virtio registers

2023-05-08 Thread Zhu Lingshan
This commit implements a new function ifcvf_get_driver_features()
which reads the driver_features from the virtio registers.

To be less ambiguous, ifcvf_set_features() is renamed to
ifcvf_set_driver_features(), and ifcvf_get_features()
is renamed to ifcvf_get_dev_features(), which returns
the provisioned vDPA device features.

Signed-off-by: Zhu Lingshan 
---
 drivers/vdpa/ifcvf/ifcvf_base.c | 38 +
 drivers/vdpa/ifcvf/ifcvf_base.h |  5 +++--
 drivers/vdpa/ifcvf/ifcvf_main.c |  9 +---
 3 files changed, 29 insertions(+), 23 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c
index 6c5650f73007..546e923bcd16 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.c
+++ b/drivers/vdpa/ifcvf/ifcvf_base.c
@@ -204,11 +204,29 @@ u64 ifcvf_get_hw_features(struct ifcvf_hw *hw)
return features;
 }
 
-u64 ifcvf_get_features(struct ifcvf_hw *hw)
+/* return provisioned vDPA dev features */
+u64 ifcvf_get_dev_features(struct ifcvf_hw *hw)
 {
return hw->dev_features;
 }
 
+u64 ifcvf_get_driver_features(struct ifcvf_hw *hw)
+{
+   struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
+   u32 features_lo, features_hi;
+   u64 features;
+
+   vp_iowrite32(0, &cfg->device_feature_select);
+   features_lo = vp_ioread32(&cfg->guest_feature);
+
+   vp_iowrite32(1, &cfg->device_feature_select);
+   features_hi = vp_ioread32(&cfg->guest_feature);
+
+   features = ((u64)features_hi << 32) | features_lo;
+
+   return features;
+}
+
 int ifcvf_verify_min_features(struct ifcvf_hw *hw, u64 features)
 {
if (!(features & BIT_ULL(VIRTIO_F_ACCESS_PLATFORM)) && features) {
@@ -275,7 +293,7 @@ void ifcvf_write_dev_config(struct ifcvf_hw *hw, u64 offset,
vp_iowrite8(*p++, hw->dev_cfg + offset + i);
 }
 
-static void ifcvf_set_features(struct ifcvf_hw *hw, u64 features)
+void ifcvf_set_driver_features(struct ifcvf_hw *hw, u64 features)
 {
struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg;
 
@@ -286,19 +304,6 @@ static void ifcvf_set_features(struct ifcvf_hw *hw, u64 features)
vp_iowrite32(features >> 32, &cfg->guest_feature);
 }
 
-static int ifcvf_config_features(struct ifcvf_hw *hw)
-{
-   ifcvf_set_features(hw, hw->req_features);
-   ifcvf_add_status(hw, VIRTIO_CONFIG_S_FEATURES_OK);
-
-   if (!(ifcvf_get_status(hw) & VIRTIO_CONFIG_S_FEATURES_OK)) {
-   IFCVF_ERR(hw->pdev, "Failed to set FEATURES_OK status\n");
-   return -EIO;
-   }
-
-   return 0;
-}
-
 u16 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid)
 {
struct ifcvf_lm_cfg __iomem *ifcvf_lm;
@@ -387,9 +392,6 @@ int ifcvf_start_hw(struct ifcvf_hw *hw)
ifcvf_add_status(hw, VIRTIO_CONFIG_S_ACKNOWLEDGE);
ifcvf_add_status(hw, VIRTIO_CONFIG_S_DRIVER);
 
-   if (ifcvf_config_features(hw) < 0)
-   return -EINVAL;
-
ifcvf_add_status(hw, VIRTIO_CONFIG_S_DRIVER_OK);
 
return 0;
diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
index d545a9411143..cb19196c3ece 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.h
+++ b/drivers/vdpa/ifcvf/ifcvf_base.h
@@ -69,7 +69,6 @@ struct ifcvf_hw {
phys_addr_t notify_base_pa;
u32 notify_off_multiplier;
u32 dev_type;
-   u64 req_features;
u64 hw_features;
/* provisioned device features */
u64 dev_features;
@@ -122,7 +121,7 @@ u8 ifcvf_get_status(struct ifcvf_hw *hw);
 void ifcvf_set_status(struct ifcvf_hw *hw, u8 status);
 void io_write64_twopart(u64 val, u32 *lo, u32 *hi);
 void ifcvf_reset(struct ifcvf_hw *hw);
-u64 ifcvf_get_features(struct ifcvf_hw *hw);
+u64 ifcvf_get_dev_features(struct ifcvf_hw *hw);
 u64 ifcvf_get_hw_features(struct ifcvf_hw *hw);
 int ifcvf_verify_min_features(struct ifcvf_hw *hw, u64 features);
 u16 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid);
@@ -137,4 +136,6 @@ int ifcvf_set_vq_address(struct ifcvf_hw *hw, u16 qid, u64 desc_area,
 u64 driver_area, u64 device_area);
 bool ifcvf_get_vq_ready(struct ifcvf_hw *hw, u16 qid);
 void ifcvf_set_vq_ready(struct ifcvf_hw *hw, u16 qid, bool ready);
+void ifcvf_set_driver_features(struct ifcvf_hw *hw, u64 features);
+u64 ifcvf_get_driver_features(struct ifcvf_hw *hw);
 #endif /* _IFCVF_H_ */
diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index 1357c67014ab..4588484bd53d 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -410,7 +410,7 @@ static u64 ifcvf_vdpa_get_device_features(struct vdpa_device *vdpa_dev)
u64 features;
 
if (type == VIRTIO_ID_NET || type == VIRTIO_ID_BLOCK)
-   features = ifcvf_get_features(vf);
+   features = ifcvf_get_dev_features(vf);
else {
features = 0;
IFCVF_ERR(pdev, "VIRTIO ID %u not supported\n", vf->dev_type);
@@ -428,7 +428,7 @@ static int ifcvf_vdpa_set_driver_
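The two 32-bit register reads that ifcvf_get_driver_features() performs
combine into one 64-bit feature word; the splicing can be checked in
isolation (a userspace sketch, with plain integers standing in for the two
guest_feature windows selected via the feature-select register):

```c
#include <assert.h>
#include <stdint.h>

/* virtio-pci exposes 64-bit feature words through a 32-bit window;
 * the driver reads the low half (select 0) and the high half
 * (select 1), then splices them back together. */
static uint64_t combine_features(uint32_t features_lo, uint32_t features_hi)
{
	return ((uint64_t)features_hi << 32) | features_lo;
}
```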

[PATCH V2 0/5] vDPA/ifcvf: implement immediate initialization mechanism

2023-05-08 Thread Zhu Lingshan
Formerly, the ifcvf driver implemented a lazy-initialization mechanism
for the virtqueues and other config space contents:
it would store all configurations passed down from userspace,
then load them into the device config space upon DRIVER_OK.

This cannot serve live migration, so this series implements an
immediate initialization mechanism: rather than the
former store-load process, virtio operations like the vq ops
take immediate actions by accessing the virtio registers.

This series also implements irq synchronization in the reset
routine.

Changes from V1:
1) poll device status in device_reset (Jason)
2) simplify the procedure which syncs irqs (Jason)
3) fix typos (Michael)

Zhu Lingshan (5):
  vDPA/ifcvf: virt queue ops take immediate actions
  vDPA/ifcvf: get_driver_features from virtio registers
  vDPA/ifcvf: retire ifcvf_start_datapath and ifcvf_add_status
  vDPA/ifcvf: synchronize irqs in the reset routine
  vDPA/ifcvf: a vendor driver should not set _CONFIG_S_FAILED

 drivers/vdpa/ifcvf/ifcvf_base.c | 146 ++--
 drivers/vdpa/ifcvf/ifcvf_base.h |  17 ++--
 drivers/vdpa/ifcvf/ifcvf_main.c |  98 -
 3 files changed, 108 insertions(+), 153 deletions(-)

-- 
2.39.1
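The store-load vs. immediate distinction described in this cover letter can
be illustrated with a toy model (a hypothetical userspace sketch; plain
struct fields stand in for the PCI common config registers that the real
driver touches with vp_iowrite16()):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the virtio-pci common config space. */
struct fake_cfg {
	uint16_t queue_select;
	uint16_t queue_size;
};

/* Former lazy mechanism: only cache the value; the device register
 * stays untouched until everything is flushed at DRIVER_OK. */
struct cached_vq {
	uint16_t num;
};

static void lazy_set_vq_num(struct cached_vq *vq, uint16_t num)
{
	vq->num = num;	/* device sees nothing yet */
}

/* Immediate mechanism: select the queue and write the register as
 * soon as the vdpa op is called, as the reworked set_vq_num() does. */
static void immediate_set_vq_num(struct fake_cfg *cfg, uint16_t qid,
				 uint16_t num)
{
	cfg->queue_select = qid;
	cfg->queue_size = num;
}
```

The immediate form is what makes the device state observable at any time,
which is the property live migration needs.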



Re: [PATCH] virtio_net: set default mtu to 1500 when 'Device maximum MTU' bigger than 1500

2023-05-08 Thread Xuan Zhuo
On Mon, 8 May 2023 02:43:24 -0400, "Michael S. Tsirkin"  wrote:
> On Mon, May 08, 2023 at 02:18:08PM +0800, Xuan Zhuo wrote:
> > On Mon, 8 May 2023 02:15:46 -0400, "Michael S. Tsirkin"  
> > wrote:
> > > On Mon, May 08, 2023 at 10:01:59AM +0800, Xuan Zhuo wrote:
> > > > On Sun, 7 May 2023 04:58:58 -0400, "Michael S. Tsirkin" 
> > > >  wrote:
> > > > > On Sat, May 06, 2023 at 04:56:35PM +0800, Hao Chen wrote:
> > > > > >
> > > > > >
> > > > > > 在 2023/5/6 10:50, Xuan Zhuo 写道:
> > > > > > > On Sat,  6 May 2023 10:15:29 +0800, Hao Chen  
> > > > > > > wrote:
> > > > > > > > When VIRTIO_NET_F_MTU(3) Device maximum MTU reporting is 
> > > > > > > > supported.
> > > > > > > > If offered by the device, device advises driver about the value 
> > > > > > > > of its
> > > > > > > > maximum MTU. If negotiated, the driver uses mtu as the maximum
> > > > > > > > > MTU value. But here the driver also uses it as the default mtu;
> > > > > > > > > some devices may have a maximum MTU greater than 1500, and this may
> > > > > > > > > cause some large packets to be discarded,
> > > > > > >
> > > > > > > You mean tx packet?
> > > > > > Yes.
> > > > > > >
> > > > > > > If yes, I do not think this is a problem of the driver.
> > > > > > >
> > > > > > > Maybe you should give more details about the discard.
> > > > > > >
> > > > > > In the current code, if the maximum MTU supported by the virtio net 
> > > > > > hardware
> > > > > > is 9000, the default MTU of the virtio net driver will also be set 
> > > > > > to 9000.
> > > > > > When sending packets through "ping -s 5000", if the peer router 
> > > > > > does not
> > > > > > support negotiating a path MTU through ICMP packets, the packets 
> > > > > > will be
> > > > > > discarded. If the peer router supports negotiating path mtu through 
> > > > > > ICMP
> > > > > > packets, the host side will perform packet fragmentation based on the
> > > > > > negotiated path mtu, which is generally within 1500.
> > > > > > This is not a bugfix patch; I think setting the default mtu to
> > > > > > within 1500 would be more suitable here. Thanks.
> > > > >
> > > > > I don't think VIRTIO_NET_F_MTU is appropriate for support for jumbo 
> > > > > packets.
> > > > > The spec says:
> > > > >   The device MUST forward transmitted packets of up to mtu (plus low
> > > > >   level ethernet header length) size with gso_type NONE or ECN, and do
> > > > >   so without fragmentation, after VIRTIO_NET_F_MTU has been
> > > > >   successfully negotiated.
> > > > > VIRTIO_NET_F_MTU has been designed for all kinds of tunneling devices,
> > > > > and this is why we set mtu to max by default.
> > > > >
> > > > > For things like jumbo frames where MTU might or might not be 
> > > > > available,
> > > > > a new feature would be more appropriate.
> > > >
> > > >
> > > > So for jumbo frame, what is the problem?
> > > >
> > > > We are trying to do this. @Heng
> > > >
> > > > Thanks.
> > >
> > > It is not a problem as such. But VIRTIO_NET_F_MTU will set the
> > > default MTU not just the maximum one, because spec seems to
> > > say it can.
> >
> > I see.
> >
> > In the case of Jumbo Frame, we also hope that the driver will set the 
> > default
> > directly to the max mtu. Just like what you said "Bigger packets = better
> > performance."
> >
> > I can't think of any scenario where the hardware supports a large mtu but
> > we do not want the user to use it by default.
>
> When other devices on the same LAN have mtu set to 1500 and
> won't accept bigger packets.

So, that depends on pmtu/tcp-probe-mtu.

If an OS without pmtu/tcp-probe-mtu has a bigger mtu, then its big packets
will be lost.

Thanks.


>
> > Of course, the scenario that this patch
> > wants to handle does exist, but I have never thought that this is a problem 
> > at
> > the driver level.
> >
> > Thanks.
> >
> >
> > >
> > >
> > > >
> > > > >
> > > > > > > > so I changed the MTU to a more
> > > > > > > > general 1500 when 'Device maximum MTU' is bigger than 1500.
> > > > > > > >
> > > > > > > > Signed-off-by: Hao Chen 
> > > > > > > > ---
> > > > > > > >   drivers/net/virtio_net.c | 5 -
> > > > > > > >   1 file changed, 4 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > > > > index 8d8038538fc4..e71c7d1b5f29 100644
> > > > > > > > --- a/drivers/net/virtio_net.c
> > > > > > > > +++ b/drivers/net/virtio_net.c
> > > > > > > > @@ -4040,7 +4040,10 @@ static int virtnet_probe(struct virtio_device *vdev)
> > > > > > > > goto free;
> > > > > > > > }
> > > > > > > >
> > > > > > > > -   dev->mtu = mtu;
> > > > > > > > +   if (mtu > 1500)
> > > > > > >
> > > > > > > s/1500/ETH_DATA_LEN/
> > > > > > >
> > > > > > > Thanks.
> > > > > > >
> > > > > > > > +   dev->mtu = 1500;
> > > > > > > > +   else
> > > > > > > > +