On Thu, Jun 29, 2023 at 9:43 PM Ilya Maximets <i.maxim...@ovn.org> wrote:
>
> On 6/20/23 13:26, David Marchand wrote:
> > At some point in OVS history, some virtio features were announced as
> > supported (ECN and UFO virtio features).
> >
> > The userspace TSO code, which has been added later, does not support
> > those features and tries to disable them.
> >
> > This breaks OVS upgrades: if an existing VM already negotiated such
> > features, their lack on reconnection to an upgraded OVS triggers a
> > vhost socket disconnection by Qemu.
> > This results in an endless loop because Qemu then retries with the same
> > set of virtio features.
> >
> > This patch proposes to try and detect those vhost socket disconnection
> > and fallback restoring the old virtio features (and disabling TSO for this
> > vhost port).
>
> I suppose, even if we can't fix live migration case, this patch sill
> makes some sense, right?
>
> We will not be able to enable TSO by default though without some extra
> work, i.e. adding some knobs...

Sorry, I have been thinking about the live migration case and did not
reply yet on the other thread (well, I have no solution for it).

Handling a simple case of upgrading OVS could still be handled with
this current patch.
So I think it still makes sense to take this patch.


>
> Some comments inline.
>
> >
> > Signed-off-by: David Marchand <david.march...@redhat.com>
> > Acked-by: Mike Pattrick <m...@redhat.com>
> > ---
> > Note: resending this v4 for CI (I added Mike ack).
> >
> > Changelog since v3:
> > - updated documentation now that the interface offloads status is reported
> >   in ovsdb,
> > - fixed one coding style issue,
> >
> > Changelog since v2:
> > - reported workaround presence in the ovsdb port status field and
> >   updated documentation accordingly,
> > - tried to use "better" names, to distinguish ECN virtio feature from
> >   TSO OVS netdev feature,
> >
> > Changelog since v1:
> > - added a note in the documentation,
> > - fixed vhost unregister trigger (so that both disabling and re-enabling
> >   TSO is handled),
> > - cleared netdev features when disabling TSO,
> > - changed level and ratelimited log message on vhost socket disconnect,
> >
> > ---
> >  Documentation/topics/userspace-tso.rst | 30 ++++++++-
> >  lib/netdev-dpdk.c                      | 85 +++++++++++++++++++++++++-
> >  2 files changed, 111 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/topics/userspace-tso.rst 
> > b/Documentation/topics/userspace-tso.rst
> > index 5a43c2e86b..d4111f1ab2 100644
> > --- a/Documentation/topics/userspace-tso.rst
> > +++ b/Documentation/topics/userspace-tso.rst
> > @@ -68,7 +68,7 @@ as follows.
> >  connection is established, `TSO` is thus advertised to the guest as an
> >  available feature:
> >
> > -QEMU Command Line Parameter::
> > +1. QEMU Command Line Parameter::
>
> If you want to fix the numbered list, we should also fix the second point.
> It is broken in two lines incorrectly (indentation is wrond), so it's not
> counted as part of the numbered list, while this one will.

Mm, not easy to notice on the generated documentation.
I'll fix it too.


>
> >
> >      $ sudo $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 \
> >      ...
> > @@ -83,6 +83,34 @@ used to enable same::
> >      $ ethtool -K eth0 tso on
> >      $ ethtool -k eth0
> >
> > +**Note:** Enabling this feature impacts the virtio features exposed by the 
> > DPDK
> > +vHost User backend to a guest. If a guest was already connected to OvS 
> > before
> > +enabling TSO and restarting OvS, this guest ports won't have TSO 
> > available::
> > +
> > +    $ ovs-vsctl get interface vhost0 status | grep -o '[^ ]*_offload=[^ 
> > ]*"'
> > +    tx_ip_csum_offload="true"
> > +    tx_sctp_csum_offload="true"
> > +    tx_tcp_csum_offload="true"
> > +    tx_tcp_seg_offload="false"
> > +    tx_udp_csum_offload="true"
> > +
> > +To help diagnose the issue, those ports have some additional information in
> > +their status field in ovsdb::
> > +
> > +    $ ovs-vsctl get interface vhost0 status | grep -o 'disabled_tso=[^ ]*"'
> > +    disabled_tso="true"
> > +
> > +To restore TSO for this guest ports, this guest QEMU process must be 
> > stopped,
> > +then started again. OvS will then report::
> > +
> > +    $ ovs-vsctl get interface vhost0 status | grep -o '[^ ]*_offload=[^ 
> > ]*"'
> > +    tx_ip_csum_offload="true"
> > +    tx_sctp_csum_offload="true"
> > +    tx_tcp_csum_offload="true"
> > +    tx_tcp_seg_offload="true"
> > +    tx_udp_csum_offload="true"
> > +    $ ovs-vsctl get interface vhost0 status | grep -o 'disabled_tso=[^ ]*"'
> > +
> >  ~~~~~~~~~~~
> >  Limitations
> >  ~~~~~~~~~~~
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index c6b41e8365..2dbb8b42e3 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -473,6 +473,15 @@ struct netdev_dpdk {
> >          /* True if vHost device is 'up' and has been reconfigured at least 
> > once */
> >          bool vhost_reconfigured;
> >
> > +        /* Set on driver start (which means after a vHost connection is
> > +         * accepted), and cleared when the vHost device gets configured. */
> > +        bool vhost_initial_config;
> > +
> > +        /* Set on disconnection if an initial configuration did not finish.
> > +         * This triggers a workaround for Virtio features negotiation, that
> > +         * makes TSO unavailable. */
> > +        bool vhost_workaround_enable_ecn;
> > +
> >          atomic_uint8_t vhost_tx_retries_max;
> >          /* 2 pad bytes here. */
>
> These bytes are consumed now.

Argh, I had made a mental note to fix this.


>
> >      );
> > @@ -1359,6 +1368,7 @@ common_construct(struct netdev *netdev, dpdk_port_t 
> > port_no,
> >      dev->requested_lsc_interrupt_mode = 0;
> >      ovsrcu_index_init(&dev->vid, -1);
> >      dev->vhost_reconfigured = false;
> > +    dev->vhost_initial_config = false;
> >      dev->attached = false;
> >      dev->started = false;
> >      dev->reset_needed = false;
> > @@ -3883,6 +3893,10 @@ netdev_dpdk_vhost_user_get_status(const struct 
> > netdev *netdev,
> >                          xasprintf("%d", vring.size));
> >      }
> >
> > +    if (dev->vhost_workaround_enable_ecn) {
> > +        smap_add_format(args, "disabled_tso", "true");
>
> I'm not sure if it's a good name, but I'm not sure how the good one
> should look like either.  Maybe "userspace-tso": "disabled" ?
> We can put any string as a value.

Either we go with your proposal, or we could add a mention in the
documentation to check if the virtio features ECN bit is present.
Something like:
# if [ $(($(ovs-vsctl get interface vhost0 status:features | sed
's/"//g') & (1 << 13))) != 0 ]; then
       echo KO;
   else
       echo ok;
   fi

This is not user friendly, but I don't think we really want users to
start relying on a specific string for this ugly situation.
WDYT?


>
> > +    }
> > +
> >      ovs_mutex_unlock(&dev->mutex);
> >      return 0;
> >  }
> > @@ -4214,6 +4228,12 @@ netdev_dpdk_remap_txqs(struct netdev_dpdk *dev)
> >      free(enabled_queues);
> >  }
> >
> > +static bool
> > +netdev_dpdk_vhost_tso_enabled(struct netdev_dpdk *dev)
> > +{
> > +    return userspace_tso_enabled() && !dev->vhost_workaround_enable_ecn;
> > +}
> > +
> >  /*
> >   * A new virtio-net device is added to a vhost port.
> >   */
> > @@ -4256,6 +4276,7 @@ new_device(int vid)
> >              } else {
> >                  /* Reconfiguration not required. */
> >                  dev->vhost_reconfigured = true;
> > +                dev->vhost_initial_config = false;
>
> So, we're in the new_device() call.  That means the guest successfully
> negotiated features.  Shouldn't we drop vhost_initial_config here
> unconditionally?  Why we need to wait for reconfiguration?  Or does
> QEMU disconnect after the new_device() ?

Indeed, QEMU does not disconnect because of features once we reach the
new_device stage.
So it seems ok.


>
> >              }
> >
> >              if (rte_vhost_get_negotiated_features(vid, &features)) {
> > @@ -4268,7 +4289,7 @@ new_device(int vid)
> >                      dev->hw_ol_features |= NETDEV_TX_SCTP_CKSUM_OFFLOAD;
> >                  }
> >
> > -                if (userspace_tso_enabled()) {
> > +                if (netdev_dpdk_vhost_tso_enabled(dev)) {
> >                      if (features & (1ULL << VIRTIO_NET_F_GUEST_TSO4)
> >                          && features & (1ULL << VIRTIO_NET_F_GUEST_TSO6)) {
> >
> > @@ -4501,6 +4522,21 @@ vring_state_changed(int vid, uint16_t queue_id, int 
> > enable)
> >      return 0;
> >  }
> >
> > +/* Requires that the 'path' socket had been registered before. */
> > +static bool
> > +dpdk_vhost_has_ecn_feature(const char *path)
> > +{
> > +    /* In the past we had this feature enabled, report it by default. */
> > +    bool has_ecn_feature = true;
> > +    uint64_t driver_features;
> > +
> > +    if (rte_vhost_driver_get_features(path, &driver_features) == 0) {
> > +        has_ecn_feature = driver_features & (1ULL << 
> > VIRTIO_NET_F_HOST_ECN);
> > +    }
> > +
> > +    return has_ecn_feature;
> > +}
> > +
> >  static void
> >  destroy_connection(int vid)
> >  {
> > @@ -4515,6 +4551,7 @@ destroy_connection(int vid)
> >          ovs_mutex_lock(&dev->mutex);
> >          if (nullable_string_is_equal(ifname, dev->vhost_id)) {
> >              uint32_t qp_num = NR_QUEUE;
> > +            bool failed_without_ecn;
> >
> >              if (netdev_dpdk_get_vid(dev) >= 0) {
> >                  VLOG_ERR("Connection on socket '%s' destroyed while vhost "
> > @@ -4528,6 +4565,26 @@ destroy_connection(int vid)
> >                  dev->requested_n_txq = qp_num;
> >                  netdev_request_reconfigure(&dev->up);
> >              }
> > +
> > +            if (dev->vhost_initial_config) {
> > +                VLOG_WARN_RL(&rl, "Connection on socket '%s' closed during 
> > "
> > +                             "initialization.", dev->vhost_id);
> > +            }
> > +
> > +            failed_without_ecn = dev->vhost_initial_config
> > +                && !dpdk_vhost_has_ecn_feature(dev->vhost_id);
> > +            if (failed_without_ecn != dev->vhost_workaround_enable_ecn) {
> > +                /* Either an early disconnection was detected (and we can 
> > try
> > +                 * to re-enable ECN for a next connection) or the 
> > disconnection
> > +                 * does not seem incorrect (and we can try to disable ECN 
> > for a
> > +                 * next connection).
> > +                 *
> > +                 * In both cases, this vhost socket must be reconfigured
> > +                 * (see netdev_dpdk_vhost_client_reconfigure()). */
> > +                dev->vhost_workaround_enable_ecn = failed_without_ecn;
> > +                netdev_request_reconfigure(&dev->up);
> > +            }
>
> If we reset vhost_initial_config unconditionally in new_device(),
> vhost_initial_config == true here will mean that the guest, likely, failed
> feature negotiation.  We can't be sure for 100%, but we can't be sure with
> the current code either.  So, we could enable workaround in this case.
> If vhost_initial_config == false, we already had a successful connection
> sometime in the past, so we should not try the workaround.
>
> Does that make sense?

It seems ok but I need to retest.


>
> > +
> >              ovs_mutex_unlock(&dev->mutex);
> >              exists = true;
> >              break;
> > @@ -5431,6 +5488,7 @@ dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev)
> >
> >          if (dev->vhost_reconfigured == false) {
> >              dev->vhost_reconfigured = true;
> > +            dev->vhost_initial_config = false;
>
> This will not be needed if we clear in the new_device.

Ok.


>
> >              /* Carrier status may need updating. */
> >              netdev_change_seq_changed(&dev->up);
> >          }
> > @@ -5458,10 +5516,30 @@ static int
> >  netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
> >  {
> >      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> > +    bool unregister = false;
> > +    bool enable_tso;
> > +    char *vhost_id;
> >      int err;
> >
> >      ovs_mutex_lock(&dev->mutex);
> >
> > +    enable_tso = netdev_dpdk_vhost_tso_enabled(dev);
> > +    if (dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT && dev->vhost_id
> > +        && enable_tso != !dpdk_vhost_has_ecn_feature(dev->vhost_id)) {
>
> In context of previous comments, the check should be;
>
>            && dev->vhost_initial_config && dev->vhost_workaround_enable_ecn
>
> ?

For a "registered" vhost-user port, we have two transitions to handle:
enabling the workaround (for a "faulty" guest after a first try
withouth ECN) and disabling it (after a "faulty" guest, which had
successfully configured, is stopped and OVS prepares for the next
connection).
Checking dev->vhost_initial_config here would prevent from handling
the second case.


>
> We might as well rename it to 'vhost_has_ever_connected' or something
> and inverse the value.
>
> > +
> > +        dev->vhost_driver_flags &= ~RTE_VHOST_USER_CLIENT;
> > +        vhost_id = dev->vhost_id;
> > +        unregister = true;
> > +    }
> > +
> > +    ovs_mutex_unlock(&dev->mutex);
> > +
> > +    if (unregister) {
> > +        dpdk_vhost_driver_unregister(dev, vhost_id);
> > +    }
> > +
> > +    ovs_mutex_lock(&dev->mutex);
> > +
> >      /* Configure vHost client mode if requested and if the following 
> > criteria
> >       * are met:
> >       *  1. Device hasn't been registered yet.
> > @@ -5491,7 +5569,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev 
> > *netdev)
> >          }
> >
> >          /* Enable External Buffers if TCP Segmentation Offload is enabled. 
> > */
> > -        if (userspace_tso_enabled()) {
> > +        if (enable_tso) {
> >              vhost_flags |= RTE_VHOST_USER_EXTBUF_SUPPORT;
> >          }
> >
> > @@ -5516,7 +5594,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev 
> > *netdev)
> >              goto unlock;
> >          }
> >
> > -        if (userspace_tso_enabled()) {
> > +        if (enable_tso) {
> >              virtio_unsup_features = 1ULL << VIRTIO_NET_F_HOST_ECN
> >                                      | 1ULL << VIRTIO_NET_F_HOST_UFO;
> >              VLOG_DBG("%s: TSO enabled on vhost port",
> > @@ -5539,6 +5617,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev 
> > *netdev)
> >              goto unlock;
> >          }
> >
> > +        dev->vhost_initial_config = true;
> >          err = rte_vhost_driver_start(dev->vhost_id);
> >          if (err) {
> >              VLOG_ERR("rte_vhost_driver_start failed for vhost user "
>


-- 
David Marchand

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to