On 6/26/23 19:46, David Marchand wrote:
> On Mon, Jun 26, 2023 at 4:05 PM David Marchand
> <david.march...@redhat.com> wrote:
>>
>> On Mon, Jun 26, 2023 at 3:19 PM Ilya Maximets <i.maxim...@ovn.org> wrote:
>>>> - if the UFO feature is "restored" in the master branch, OVS can't
>>>> expose CSUM if the guest negotiated UFO.
>>>
>>> Yep.  IIUC, that will require un-doing/re-working some of the changes Mike
>>> did in order to restore the ability to disable advertising of checksum offload.
>>
>> We need to disable VIRTIO_NET_F_CSUM (as it was before Mike's series)
>> and make sure ECN and UFO are enabled.
>> This is the common point for all versions of OVS until now.
> 
> - I have been testing a fix, and unfortunately, I can't re-enable UFO
> through the vhost API.
> Once disabled in rte_vhost_driver_register(), a feature is removed
> from the supported features set.
> rte_vhost_driver_disable_features() does not care about this supported
> features set, but rte_vhost_driver_enable_features() does.
> 
> This means that the issue affecting upgrades from 2.11 (because of
> UFO) can't be fixed from the OVS side.
> Maybe it could be "fixed" with a change on the DPDK side, but I am not
> sure it is worth the risk to preserve a workaround.
> 
> 
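To make the asymmetry described above concrete, here is a toy model in
Python. It is only a sketch built from the description in this thread, not
DPDK code: the class and method names are hypothetical, and the UFO bit
position (14, VIRTIO_NET_F_HOST_UFO) is taken from the virtio specification
as one illustrative UFO bit.

```python
# Toy model of the behaviour described in the thread; NOT DPDK code.
# All names below are hypothetical. Bit position is from the virtio spec.
VIRTIO_NET_F_HOST_UFO = 14


class VhostSocketModel:
    """Hypothetical stand-in for a vhost-user socket's feature state."""

    def __init__(self, all_features):
        self.supported = all_features  # what the backend may ever offer
        self.enabled = all_features    # what is currently advertised

    def register_without(self, mask):
        # Models a feature dropped at registration time: it leaves the
        # supported set, not just the advertised one.
        self.supported &= ~mask
        self.enabled &= ~mask

    def disable_features(self, mask):
        # No check against the supported set: always succeeds.
        self.enabled &= ~mask
        return 0

    def enable_features(self, mask):
        # Refuses anything that is outside the supported set.
        if (self.supported & mask) != mask:
            return -1
        self.enabled |= mask
        return 0


ufo = 1 << VIRTIO_NET_F_HOST_UFO
sock = VhostSocketModel(all_features=(1 << 16) - 1)
sock.register_without(ufo)          # feature dropped at registration
result = sock.enable_features(ufo)  # attempt to bring it back
print(result, bool(sock.enabled & ufo))
```

In this model the later enable call is rejected because the bit has already
left the supported set, which matches the reported inability to re-enable
UFO from the application side.
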
> - Now, pragmatically, this issue on the UFO feature would result in
> losing connectivity on a vhost-user port.
> An affected guest would not get a chance to see the newly added
> VIRTIO_NET_F_CSUM feature.

Yeah, this one is toast, I guess...

> 
> If we can't "fix" UFO advertisement to preserve this workaround on OVS
> side, there may be nothing to do after Mike's series in the end.
> And my followup patch could be kept as proposed in its v4 form.
> 
> WDYT?

Might be fine, but I was thinking about upgrades and this fallback
mechanism is not going to work for live migration, IIUC.

In a live migration case, QEMU will start and negotiate features with
OVS before the migration can start.  After that the virtio device
state will be sent from the source and QEMU will try to load it.
virtio_load will fail due to feature difference.  QEMU will not try to
re-connect in this case.  It will exit with a fault instead.  OVS will
not have a chance to re-negotiate, hence no workaround is actually
possible.  ECN should be in the feature list from the beginning.
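
A rough sketch of why the load fails, in Python. This assumes the check
boils down to "every feature the guest acked on the source must also be
offered on the destination"; the function name and the example feature
sets are hypothetical, and only the bit positions come from the virtio
specification.

```python
# Sketch of the assumed migration-time feature check; not QEMU code.
# Bit positions are from the virtio specification.
VIRTIO_NET_F_CSUM = 0
VIRTIO_NET_F_HOST_ECN = 13
VIRTIO_NET_F_HOST_UFO = 14


def bits(*positions):
    """Build a feature mask from bit positions."""
    mask = 0
    for pos in positions:
        mask |= 1 << pos
    return mask


def can_load_device_state(source_guest_features, dest_host_features):
    """Assumed rule: the migrated feature set must be a subset of what
    the destination offers. Returns (ok, missing_mask)."""
    missing = source_guest_features & ~dest_host_features
    return missing == 0, missing


# Hypothetical guest that negotiated ECN and UFO on the old host.
source = bits(VIRTIO_NET_F_HOST_ECN, VIRTIO_NET_F_HOST_UFO)

# Hypothetical destination that offers CSUM and UFO but not ECN.
dest_without_ecn = bits(VIRTIO_NET_F_CSUM, VIRTIO_NET_F_HOST_UFO)
ok, missing = can_load_device_state(source, dest_without_ecn)

# Same destination with ECN offered from the beginning.
dest_with_ecn = dest_without_ecn | bits(VIRTIO_NET_F_HOST_ECN)
ok_with_ecn, _ = can_load_device_state(source, dest_with_ecn)

print(ok, hex(missing), ok_with_ecn)
```

Under that assumption the first case is rejected at load time, with no
later point at which the feature set could be adjusted, while the second
one passes.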

Am I missing something?

IIUC, that means that in order to enable TSO by default users will have
to explicitly disable it before upgrade for all existing VMs that
do not have it enabled.  Since the knob is global, that potentially
means disabling userspace-tso forever and not having it available for
any new VMs as well, unless a brand new host is added to the cluster
that will never run pre-existing VMs.

And, for example, upgrades in OpenStack clusters are actually just
migration from + host upgrade + migration back.

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
