Re: [RFC PATCH net 0/3] virtio-net: allow usage of small vrings
On Sun, Apr 30, 2023 at 04:15:15PM +0300, Alvaro Karsz wrote: > At the moment, if a virtio network device uses vrings with less than > MAX_SKB_FRAGS + 2 entries, the device won't be functional. > > The following condition vq->num_free >= 2 + MAX_SKB_FRAGS will always > evaluate to false, leading to TX timeouts. > > This patchset attempts this fix this bug, and to allow small rings down > to 4 entries. > > The first patch introduces a new mechanism in virtio core - it allows to > block features in probe time. > > If a virtio drivers blocks features and fails probe, virtio core will > reset the device, re-negotiate the features and probe again. > > This is needed since some virtio net features are not supported with > small rings. > > This patchset follows a discussion in the mailing list [1]. > > This fixes only part of the bug, rings with less than 4 entries won't > work. > My intention is to split the effort and fix the RING_SIZE < 4 case in a > follow up patchset. > > Maybe we should fail probe if RING_SIZE < 4 until the follow up patchset? > > I tested the patchset with SNET DPU (drivers/vdpa/solidrun), with packed > and split VQs, with rings down to 4 entries, with and without > VIRTIO_NET_F_MRG_RXBUF, with big MTUs. > > I would appreciate more testing. > Xuan: I wasn't able to test XDP with my setup, maybe you can help with > that? > > [1] > https://lore.kernel.org/lkml/20230416074607.292616-1-alvaro.ka...@solid-run.com/ the work is orphaned for now. Jason do you want to pick this up? Related to all the hardening I guess ... > Alvaro Karsz (3): > virtio: re-negotiate features if probe fails and features are blocked > virtio-net: allow usage of vrings smaller than MAX_SKB_FRAGS + 2 > virtio-net: block ethtool from converting a ring to a small ring > > drivers/net/virtio_net.c | 161 +-- > drivers/virtio/virtio.c | 73 +- > include/linux/virtio.h | 3 + > 3 files changed, 212 insertions(+), 25 deletions(-) > > -- > 2.34.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [RFC PATCH net 0/3] virtio-net: allow usage of small vrings
On Mon, May 01, 2023 at 11:41:44AM +, Alvaro Karsz wrote: > > > > Why the difference? > > > > > > > > > > Because the RING_SIZE < 4 case requires much more adjustments. > > > > > > * We may need to squeeze the virtio header into the headroom. > > > * We may need to squeeze the GSO header into the headroom, or block the > > > features. > > > > We alread do this though no? > > I think we'll need to tweak hard_header_len to guarantee it's there > > as opposed to needed_headroom ... > > > > > * At the moment, without NETIF_F_SG, we can receive a skb with 2 > > > segments, we may need to reduce it to 1. > > > > You are saying clearing NETIF_F_SG does not guarantee a linear skb? > > > > I don't know.. > I'm not sure what is the cause, but using this patchset, without any host GSO > feature, I can get a chain of 3 descriptors. > Posing an example of a 4 entries ring during iperf3, acting as a client: > (TX descriptors) > > len=86 flags 0x1 addr 0xf738d000 > len=1448 flags 0x0 addr 0xf738d800 > len=86 flags 0x8081 addr 0xf738e000 > len=1184, flags 0x8081 addr 0xf738e800 > len=264 flags 0x8080 addr 0xf738f000 > len=86 flags 0x8081 addr 0xf738f800 > len=1448 flags 0x0 addr 0xf739 > len=86 flags 0x1 addr 0xf7390800 > len=1448 flags 0x0 addr 0xf7391000 > len=86 flags 0x1 addr 0xf716a800 > len=1448 flags 0x8080 addr 0xf716b000 > len=86 flags 0x8081 addr 0xf7391800 > len=1448 flags 0x8080 addr 0xf7392000 > > We got a chain of 3 in here. > This happens often. > > Now, when negotiating host GSO features, I can get up to 4: > > len=86 flags 0x1 addr 0xf71fc800 > len=21328 flags 0x1 addr 0xf6e00800 > len=32768 flags 0x8081 addr 0xf6e06000 > len=11064 flags 0x8080 addr 0xf6e0e000 > len=86 flags 0x8081 addr 0xf738b000 > len=1 flags 0x8080 addr 0xf738b800 > len=86 flags 0x1 addr 0xf738c000 > len=21704 flags 0x1 addr 0xf738c800 > len=32768 flags 0x1 addr 0xf7392000 > len=10688 flags 0x0 addr 0xf739a000 > len=86 flags 0x8081 addr 0xf739d000 > len=22080 flags 0x8081 addr 0xf739d800 > len=32768 flags 0x8081 addr 0xf73a3000 > len=10312 flags 0x8080 addr 0xf73ab000 > > TBH, I thought that this behaviour was expected until you mentioned it, > This is why virtnet_calc_max_descs returns 3 if no host_gso feature is > negotiated, and 4 otherwise. > I was thinking that we may need to use another skb to hold the TSO template > (for headers generation)... > > Any ideas? Something's wrong with the code apparently. Want to try sticking printk's in the driver to see what is going on? Documentation/networking/netdev-features.rst says: Those features say that ndo_start_xmit can handle fragmented skbs: NETIF_F_SG --- paged skbs (skb_shinfo()->frags), NETIF_F_FRAGLIST --- chained skbs (skb->next/prev list). of course we can always linearize if we want to ... > > > * We may need to change all the control commands, so class, command and > > > command specific data will fit in a single segment. > > > * We may need to disable the control command and all the features > > > depending on it. > > > > well if we don't commands just fail as we can't add them right? > > no corruption or stalls ... > > > > > * We may need to disable NAPI? > > > > hmm why napi? > > > > I'm not sure if it's required to disable it, but I'm not sure what's the > point having napi if the ring size is 1.. > Will it work? Not much point in it but it's simpler to just keep things consistent. > > > There may be more changes.. > > > > > > I was thinking that it may be easier to start with the easier case > > > RING_SIZE >= 4, make sure everything is working fine, then send a follow > > > up patchset with the required adjustments for RING_SIZE < 4. > > > > > > it's ok but I'm just trying to figure out where does 4 come from. > > > > I guess this part was not clear, sorry.. > In case of split vqs, we have ring size 2 before 4. > And as you saw, we still get chains of 3 when NETIF_F_SG is off (Which I > thought was expected). > Worth figuring out where do these come from. -- MST ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [RFC PATCH net 0/3] virtio-net: allow usage of small vrings
> > > Why the difference? > > > > > > > Because the RING_SIZE < 4 case requires much more adjustments. > > > > * We may need to squeeze the virtio header into the headroom. > > * We may need to squeeze the GSO header into the headroom, or block the > > features. > > We alread do this though no? > I think we'll need to tweak hard_header_len to guarantee it's there > as opposed to needed_headroom ... > > > * At the moment, without NETIF_F_SG, we can receive a skb with 2 segments, > > we may need to reduce it to 1. > > You are saying clearing NETIF_F_SG does not guarantee a linear skb? > I don't know.. I'm not sure what is the cause, but using this patchset, without any host GSO feature, I can get a chain of 3 descriptors. Posing an example of a 4 entries ring during iperf3, acting as a client: (TX descriptors) len=86 flags 0x1 addr 0xf738d000 len=1448 flags 0x0 addr 0xf738d800 len=86 flags 0x8081 addr 0xf738e000 len=1184, flags 0x8081 addr 0xf738e800 len=264 flags 0x8080 addr 0xf738f000 len=86 flags 0x8081 addr 0xf738f800 len=1448 flags 0x0 addr 0xf739 len=86 flags 0x1 addr 0xf7390800 len=1448 flags 0x0 addr 0xf7391000 len=86 flags 0x1 addr 0xf716a800 len=1448 flags 0x8080 addr 0xf716b000 len=86 flags 0x8081 addr 0xf7391800 len=1448 flags 0x8080 addr 0xf7392000 We got a chain of 3 in here. This happens often. Now, when negotiating host GSO features, I can get up to 4: len=86 flags 0x1 addr 0xf71fc800 len=21328 flags 0x1 addr 0xf6e00800 len=32768 flags 0x8081 addr 0xf6e06000 len=11064 flags 0x8080 addr 0xf6e0e000 len=86 flags 0x8081 addr 0xf738b000 len=1 flags 0x8080 addr 0xf738b800 len=86 flags 0x1 addr 0xf738c000 len=21704 flags 0x1 addr 0xf738c800 len=32768 flags 0x1 addr 0xf7392000 len=10688 flags 0x0 addr 0xf739a000 len=86 flags 0x8081 addr 0xf739d000 len=22080 flags 0x8081 addr 0xf739d800 len=32768 flags 0x8081 addr 0xf73a3000 len=10312 flags 0x8080 addr 0xf73ab000 TBH, I thought that this behaviour was expected until you mentioned it, This is why virtnet_calc_max_descs returns 3 if no host_gso feature is negotiated, and 4 otherwise. I was thinking that we may need to use another skb to hold the TSO template (for headers generation)... Any ideas? > > * We may need to change all the control commands, so class, command and > > command specific data will fit in a single segment. > > * We may need to disable the control command and all the features depending > > on it. > > well if we don't commands just fail as we can't add them right? > no corruption or stalls ... > > > * We may need to disable NAPI? > > hmm why napi? > I'm not sure if it's required to disable it, but I'm not sure what's the point having napi if the ring size is 1.. Will it work? > > There may be more changes.. > > > > I was thinking that it may be easier to start with the easier case > > RING_SIZE >= 4, make sure everything is working fine, then send a follow up > > patchset with the required adjustments for RING_SIZE < 4. > > > it's ok but I'm just trying to figure out where does 4 come from. > I guess this part was not clear, sorry.. In case of split vqs, we have ring size 2 before 4. And as you saw, we still get chains of 3 when NETIF_F_SG is off (Which I thought was expected). ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [RFC PATCH net 0/3] virtio-net: allow usage of small vrings
On Sun, Apr 30, 2023 at 06:15:03PM +, Alvaro Karsz wrote: > > > > This patchset follows a discussion in the mailing list [1]. > > > > > > This fixes only part of the bug, rings with less than 4 entries won't > > > work. > > > > Why the difference? > > > > Because the RING_SIZE < 4 case requires much more adjustments. > > * We may need to squeeze the virtio header into the headroom. > * We may need to squeeze the GSO header into the headroom, or block the > features. We alread do this though no? I think we'll need to tweak hard_header_len to guarantee it's there as opposed to needed_headroom ... > * At the moment, without NETIF_F_SG, we can receive a skb with 2 segments, we > may need to reduce it to 1. You are saying clearing NETIF_F_SG does not guarantee a linear skb? > * We may need to change all the control commands, so class, command and > command specific data will fit in a single segment. > * We may need to disable the control command and all the features depending > on it. well if we don't commands just fail as we can't add them right? no corruption or stalls ... > * We may need to disable NAPI? hmm why napi? > There may be more changes.. > > I was thinking that it may be easier to start with the easier case RING_SIZE > >= 4, make sure everything is working fine, then send a follow up patchset > with the required adjustments for RING_SIZE < 4. it's ok but I'm just trying to figure out where does 4 come from. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [RFC PATCH net 0/3] virtio-net: allow usage of small vrings
> > This patchset follows a discussion in the mailing list [1]. > > > > This fixes only part of the bug, rings with less than 4 entries won't > > work. > > Why the difference? > Because the RING_SIZE < 4 case requires much more adjustments. * We may need to squeeze the virtio header into the headroom. * We may need to squeeze the GSO header into the headroom, or block the features. * At the moment, without NETIF_F_SG, we can receive a skb with 2 segments, we may need to reduce it to 1. * We may need to change all the control commands, so class, command and command specific data will fit in a single segment. * We may need to disable the control command and all the features depending on it. * We may need to disable NAPI? There may be more changes.. I was thinking that it may be easier to start with the easier case RING_SIZE >= 4, make sure everything is working fine, then send a follow up patchset with the required adjustments for RING_SIZE < 4. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [RFC PATCH net 0/3] virtio-net: allow usage of small vrings
On Sun, Apr 30, 2023 at 04:15:15PM +0300, Alvaro Karsz wrote: > At the moment, if a virtio network device uses vrings with less than > MAX_SKB_FRAGS + 2 entries, the device won't be functional. > > The following condition vq->num_free >= 2 + MAX_SKB_FRAGS will always > evaluate to false, leading to TX timeouts. > > This patchset attempts this fix this bug, and to allow small rings down > to 4 entries. > The first patch introduces a new mechanism in virtio core - it allows to > block features in probe time. > > If a virtio drivers blocks features and fails probe, virtio core will > reset the device, re-negotiate the features and probe again. > > This is needed since some virtio net features are not supported with > small rings. > > This patchset follows a discussion in the mailing list [1]. > > This fixes only part of the bug, rings with less than 4 entries won't > work. Why the difference? > My intention is to split the effort and fix the RING_SIZE < 4 case in a > follow up patchset. > > Maybe we should fail probe if RING_SIZE < 4 until the follow up patchset? I'd keep current behaviour. > I tested the patchset with SNET DPU (drivers/vdpa/solidrun), with packed > and split VQs, with rings down to 4 entries, with and without > VIRTIO_NET_F_MRG_RXBUF, with big MTUs. > > I would appreciate more testing. > Xuan: I wasn't able to test XDP with my setup, maybe you can help with > that? > > [1] > https://lore.kernel.org/lkml/20230416074607.292616-1-alvaro.ka...@solid-run.com/ > > Alvaro Karsz (3): > virtio: re-negotiate features if probe fails and features are blocked > virtio-net: allow usage of vrings smaller than MAX_SKB_FRAGS + 2 > virtio-net: block ethtool from converting a ring to a small ring > > drivers/net/virtio_net.c | 161 +-- > drivers/virtio/virtio.c | 73 +- > include/linux/virtio.h | 3 + > 3 files changed, 212 insertions(+), 25 deletions(-) > > -- > 2.34.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[RFC PATCH net 0/3] virtio-net: allow usage of small vrings
At the moment, if a virtio network device uses vrings with less than MAX_SKB_FRAGS + 2 entries, the device won't be functional. The following condition vq->num_free >= 2 + MAX_SKB_FRAGS will always evaluate to false, leading to TX timeouts. This patchset attempts this fix this bug, and to allow small rings down to 4 entries. The first patch introduces a new mechanism in virtio core - it allows to block features in probe time. If a virtio drivers blocks features and fails probe, virtio core will reset the device, re-negotiate the features and probe again. This is needed since some virtio net features are not supported with small rings. This patchset follows a discussion in the mailing list [1]. This fixes only part of the bug, rings with less than 4 entries won't work. My intention is to split the effort and fix the RING_SIZE < 4 case in a follow up patchset. Maybe we should fail probe if RING_SIZE < 4 until the follow up patchset? I tested the patchset with SNET DPU (drivers/vdpa/solidrun), with packed and split VQs, with rings down to 4 entries, with and without VIRTIO_NET_F_MRG_RXBUF, with big MTUs. I would appreciate more testing. Xuan: I wasn't able to test XDP with my setup, maybe you can help with that? [1] https://lore.kernel.org/lkml/20230416074607.292616-1-alvaro.ka...@solid-run.com/ Alvaro Karsz (3): virtio: re-negotiate features if probe fails and features are blocked virtio-net: allow usage of vrings smaller than MAX_SKB_FRAGS + 2 virtio-net: block ethtool from converting a ring to a small ring drivers/net/virtio_net.c | 161 +-- drivers/virtio/virtio.c | 73 +- include/linux/virtio.h | 3 + 3 files changed, 212 insertions(+), 25 deletions(-) -- 2.34.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization