On Wed, May 27, 2026 at 8:53 PM Mark Michelson via dev <
[email protected]> wrote:
> In commit 621f85e92437 ("controller: Fix bfd up too early after
> unexpected reboot."), OVN would manipulate the flow-restore-wait flag in
> order to attempt to synchronize with OVS. The previous commit reverted
> this change due to a potential deadlock between OVS and OVN.
>
> If users were using a version of OVN that had commit 621f85e92437, then
> the revert will help to ensure that OVS and OVN won't deadlock any
> longer. However, the revert on its own does not resolve the deadlock if
> the previously running version of ovn-controller set flow-restore-wait
> to true. It requires manual intervention to get flow-restore-wait set to
> false or deleted from the Open_vSwitch table.
>
> This commit seeks to assist by clearing the flow-restore-wait key in the
> Open_vSwitch table. This way, on upgrade, the deadlock is resolved
> without manually setting any OVS database values.
>
> Reported-at: https://redhat.atlassian.net/browse/FDP-3862
> Fixes: 621f85e92437 ("controller: Fix bfd up too early after unexpected
> reboot.")
> Signed-off-by: Mark Michelson <[email protected]>
> ---
>
Hi Mark,

thank you for the patch. I see one slight issue with it:
it removes any "flow-restore-wait" unconditionally, even
though someone other than OVN might have set it. The reverted
code used "ovn-managed-flow-restore-wait" as an indication
that the flag was set by OVN. We could leverage that and remove
both "ovn-managed-flow-restore-wait" and "flow-restore-wait"
only if "ovn-managed-flow-restore-wait" is "true". WDYT?
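
To make the suggestion concrete, here is a rough standalone sketch of the
conditional cleanup. It is not a patch against ovn-controller.c: struct
config and the config_get/config_set/config_del helpers are hypothetical
stand-ins for the other_config smap and the generated ovsrec_* helpers,
and it assumes the marker key lives in the same map as
"flow-restore-wait".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the other_config column. The real code
 * would use struct smap and the generated ovsrec_* helpers instead. */
#define MAX_KEYS 8

struct kv {
    const char *key;
    const char *value;
};

struct config {
    struct kv entries[MAX_KEYS];
    size_t n;
};

/* Returns the value stored for 'key', or NULL if it is absent. */
static const char *
config_get(const struct config *cfg, const char *key)
{
    for (size_t i = 0; i < cfg->n; i++) {
        if (!strcmp(cfg->entries[i].key, key)) {
            return cfg->entries[i].value;
        }
    }
    return NULL;
}

/* Sets 'key' to 'value', overwriting any existing entry. */
static void
config_set(struct config *cfg, const char *key, const char *value)
{
    for (size_t i = 0; i < cfg->n; i++) {
        if (!strcmp(cfg->entries[i].key, key)) {
            cfg->entries[i].value = value;
            return;
        }
    }
    assert(cfg->n < MAX_KEYS);
    cfg->entries[cfg->n].key = key;
    cfg->entries[cfg->n].value = value;
    cfg->n++;
}

/* Deletes 'key' if present by moving the last entry into its slot. */
static void
config_del(struct config *cfg, const char *key)
{
    for (size_t i = 0; i < cfg->n; i++) {
        if (!strcmp(cfg->entries[i].key, key)) {
            cfg->entries[i] = cfg->entries[--cfg->n];
            return;
        }
    }
}

/* Removes both keys only when the marker says OVN set the flag.
 * Returns true if the keys were removed, false if nothing was touched. */
static bool
clear_ovn_managed_flow_restore_wait(struct config *cfg)
{
    const char *managed = config_get(cfg, "ovn-managed-flow-restore-wait");

    if (!managed || strcmp(managed, "true")) {
        /* Not marked as OVN-managed: leave flow-restore-wait alone. */
        return false;
    }
    config_del(cfg, "flow-restore-wait");
    config_del(cfg, "ovn-managed-flow-restore-wait");
    return true;
}
```

With this shape, a "flow-restore-wait" set by another tool survives the
upgrade, while a value left behind by the reverted OVN code is cleaned up
together with its marker.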

>  controller/ovn-controller.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 7e6c9e69a..51dc394f0 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -7895,6 +7895,20 @@ main(int argc, char *argv[])
>          ovsrec_open_vswitch_update_other_config_setkey(
>              cfg, "vlan-limit", "0");
>      }
> +    /* Clear flow-restore-wait. OVN at one point would set
> +     * flow-restore-wait in order to try to synchronize with
> +     * OVS. However, that resulted in a bug, so that behavior
> +     * was reverted. If upgrading from a version where OVN
> +     * manipulated flow-restore-wait, then flow-restore-wait
> +     * needs to be cleared in order for OVS to function
> +     * properly. This is (hopefully) a temporary measure until
> +     * a more reliable method of synchronizing with OVS is
> +     * devised.
> +     */
> +    if (smap_get(&cfg->other_config, "flow-restore-wait")) {
> +        ovsrec_open_vswitch_update_other_config_delkey(
> +            cfg, "flow-restore-wait");
> +    }
>  }
>
> static bool chassis_idx_stored = false;
> --
> 2.52.0
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Regards,
Ales