Re: [PATCH v6] virtio-net: Fix network stall at the host side waiting for kick

2024-06-30 Thread Michael S. Tsirkin
Thanks for the patch!
Yes something to improve:


On Sun, Jun 30, 2024 at 02:36:15PM +0800, Wencheng Yang wrote:
> From: thomas 
> 
> Patch 06b12970174 ("virtio-net: fix network stall under load")
> added double-check to test whether the available buffer size
> can satisfy the request or not, in case the guest has added
> some buffers to the avail ring simultaneously after the first
> check. It will be lucky if the available buffer size becomes
> okay after the double-check, then the host can send the packet
> to the guest. If the buffer size still can't satisfy the request,
> even if the guest has added some buffers, viritio-net would
> stall at the host side forever.
> 
> The patch checks whether the guest has added some buffers
> after last check of avail idx when the available buffers are
> not sufficient, if so then recheck the available buffers
> in the loop.
> 
> The patch also reverts patch "06b12970174".
> 
> The case below can reproduce the stall.
> 
>Guest 0
>  ++
>  | iperf  |
> ---> | server |
>  Host   |++
>++   |...
>| iperf  |
>| client |  Guest n
>++   |++
> || iperf  |
> ---> | server |
>  ++
> 
> Boot many guests from qemu with virtio network:
>  qemu ... -netdev tap,id=net_x \
> -device virtio-net-pci-non-transitional,\
> iommu_platform=on,mac=xx:xx:xx:xx:xx:xx,netdev=net_x
> 
> Each guest acts as iperf server with commands below:
>  iperf3 -s -D -i 10 -p 8001
>  iperf3 -s -D -i 10 -p 8002
> 
> The host as iperf client:
>  iperf3 -c guest_IP -p 8001 -i 30 -w 256k -P 20 -t 4
>  iperf3 -c guest_IP -p 8002 -i 30 -w 256k -P 20 -t 4
> 
> After some time, the host loses connection to the guest,
> the guest can send packet to the host, but can't receive
> packet from host.
> 
> It's more likely to happen if SWIOTLB is enabled in the guest,
> allocating and freeing bounce buffer takes some CPU ticks,
> copying from/to bounce buffer takes more CPU ticks, compared
> with that there is no bounce buffer in the guest.
> Once the rate of producing packets from the host approximates
> the rate of receiveing packets in the guest, the guest would
> loop in NAPI.
> 
>  receive packets---
>| |
>v |
>free buf  virtnet_poll
>| |
>v |
>  add buf to avail ring  ---
>|
>|  need kick the host?
>|  NAPI continues
>v
>  receive packets---
>| |
>v |
>free buf  virtnet_poll
>| |
>v |
>  add buf to avail ring  ---
>|
>v
>   ...   ...
> 
> On the other hand, the host fetches free buf from avail
> ring, if the buf in the avail ring is not enough, the
> host notifies the guest the event by writing the avail
> idx read from avail ring to the event idx of used ring,
> then the host goes to sleep, waiting for the kick signal
> from the guest.
> 
> Once the guest finds the host is waiting for kick singal
> (in virtqueue_kick_prepare_split()), it kicks the host.
> 
> The host may stall forever at the sequences below:
> 
>  HostGuest
>   ---
>  fetch buf, send packet   receive packet ---
>  ...  ... |
>  fetch buf, send packet add buf   |
>  ...add buf   virtnet_poll
> buf not enough  avail idx-> add buf   |
> read avail idx  add buf   |
> add buf  ---
>   receive packet ---
> write event idx   ... |
> waiting for kick add buf   virtnet_poll
>   ... |
>  ---
>  no more packet, exit NAPI
> 
> In the first loop of NAPI above, indicated in the range of
> virtnet_poll above, the host is sending packets while the
> guest is receiving packets and adding buf.
>  step 1: The buf is not enough, for example, a big packet
>  needs 5 buf, but the available buf count is 3.
>  The host read current avail idx.
>  step 2: The guest adds some buf, then checks whether the
>  host is waiting for kick signal, not at this time.
>  The used ring is not emp

[PATCH v6] virtio-net: Fix network stall at the host side waiting for kick

2024-06-29 Thread Wencheng Yang
From: thomas 

Patch 06b12970174 ("virtio-net: fix network stall under load")
added double-check to test whether the available buffer size
can satisfy the request or not, in case the guest has added
some buffers to the avail ring simultaneously after the first
check. It will be lucky if the available buffer size becomes
okay after the double-check, then the host can send the packet
to the guest. If the buffer size still can't satisfy the request,
even if the guest has added some buffers, viritio-net would
stall at the host side forever.

The patch checks whether the guest has added some buffers
after last check of avail idx when the available buffers are
not sufficient, if so then recheck the available buffers
in the loop.

The patch also reverts patch "06b12970174".

The case below can reproduce the stall.

   Guest 0
 ++
 | iperf  |
---> | server |
 Host   |++
   ++   |...
   | iperf  |
   | client |  Guest n
   ++   |++
|| iperf  |
---> | server |
 ++

Boot many guests from qemu with virtio network:
 qemu ... -netdev tap,id=net_x \
-device virtio-net-pci-non-transitional,\
iommu_platform=on,mac=xx:xx:xx:xx:xx:xx,netdev=net_x

Each guest acts as iperf server with commands below:
 iperf3 -s -D -i 10 -p 8001
 iperf3 -s -D -i 10 -p 8002

The host as iperf client:
 iperf3 -c guest_IP -p 8001 -i 30 -w 256k -P 20 -t 4
 iperf3 -c guest_IP -p 8002 -i 30 -w 256k -P 20 -t 4

After some time, the host loses connection to the guest,
the guest can send packet to the host, but can't receive
packet from host.

It's more likely to happen if SWIOTLB is enabled in the guest,
allocating and freeing bounce buffer takes some CPU ticks,
copying from/to bounce buffer takes more CPU ticks, compared
with that there is no bounce buffer in the guest.
Once the rate of producing packets from the host approximates
the rate of receiveing packets in the guest, the guest would
loop in NAPI.

 receive packets---
   | |
   v |
   free buf  virtnet_poll
   | |
   v |
 add buf to avail ring  ---
   |
   |  need kick the host?
   |  NAPI continues
   v
 receive packets---
   | |
   v |
   free buf  virtnet_poll
   | |
   v |
 add buf to avail ring  ---
   |
   v
  ...   ...

On the other hand, the host fetches free buf from avail
ring, if the buf in the avail ring is not enough, the
host notifies the guest the event by writing the avail
idx read from avail ring to the event idx of used ring,
then the host goes to sleep, waiting for the kick signal
from the guest.

Once the guest finds the host is waiting for kick singal
(in virtqueue_kick_prepare_split()), it kicks the host.

The host may stall forever at the sequences below:

 HostGuest
  ---
 fetch buf, send packet   receive packet ---
 ...  ... |
 fetch buf, send packet add buf   |
 ...add buf   virtnet_poll
buf not enough  avail idx-> add buf   |
read avail idx  add buf   |
add buf  ---
  receive packet ---
write event idx   ... |
waiting for kick add buf   virtnet_poll
  ... |
 ---
 no more packet, exit NAPI

In the first loop of NAPI above, indicated in the range of
virtnet_poll above, the host is sending packets while the
guest is receiving packets and adding buf.
 step 1: The buf is not enough, for example, a big packet
 needs 5 buf, but the available buf count is 3.
 The host read current avail idx.
 step 2: The guest adds some buf, then checks whether the
 host is waiting for kick signal, not at this time.
 The used ring is not empty, the guest continues
 the second loop of NAPI.
 step 3: The host writes the avail idx read from avail
 ring to used ring as event idx via
 virtio_queue_set_notification(q->rx_vq, 1).
 step 4: At the end of the second loop of NAPI, recheck
 whether kick is needed, as the event idx in the
 used ring written