Hi everyone,

There is an issue raised in the ovn-k8s community [0] due to a regression
introduced by OVN commit [1]. The commit breaks the MEG (Multiple External
Gateways) feature, which requires that src-routes be preferred over
dst-routes.

Changing the default behavior with commit [1] was considered risky, so the
change had been waiting for feedback for several months. Unfortunately, it
didn't get attention until now.

The old behavior of OVN (before commit [1]): the route with the longest
prefix is preferred, regardless of whether it is a src or dst route. If the
prefix lengths are equal, the dst route is preferred over the src route.
This was essentially wrong because the src and dst IPs are different
fields, and it is not reasonable to compare prefix lengths between them.
The old behavior leads to unpredictable routing of east-west traffic on the
ovn-k8s central mode cluster router when node-subnet prefix lengths differ
across nodes. The MEG feature of ovn-k8s happened to work because the src
routes added for that feature were all /32. What the feature really
requires is that src routes be preferred over dst routes. After commit [1],
the unpredictable east-west routing problem in the central mode cluster
router is resolved, but the MEG feature is broken.
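
To make the difference concrete, here is a minimal sketch of the three
orderings discussed in this thread. It is a toy model, not OVN code: the
Route class, the key functions, and all addresses and nexthop names are
made up for illustration, and "dst_over_src" reflects my reading of the
behavior after commit [1].

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Route:
    policy: str   # "src-ip" or "dst-ip"
    prefix: str   # CIDR matched against the src or dst address
    nexthop: str  # hypothetical nexthop label


def plen(route):
    return ip_network(route.prefix).prefixlen


def matches(route, src, dst):
    # A src-ip route matches on the packet source, a dst-ip route on the
    # packet destination.
    addr = src if route.policy == "src-ip" else dst
    return ip_address(addr) in ip_network(route.prefix)


def old_key(route):
    # Old behavior: longest prefix first, dst beats src only on a tie.
    return (plen(route), route.policy == "dst-ip")


def dst_over_src_key(route):
    # Assumed behavior after commit [1]: any dst route beats any src route.
    return (route.policy == "dst-ip", plen(route))


def src_over_dst_key(route):
    # What MEG needs: any src route beats any dst route.
    return (route.policy == "src-ip", plen(route))


def select(routes, src, dst, key):
    candidates = [r for r in routes if matches(r, src, dst)]
    return max(candidates, key=key).nexthop if candidates else None


# Hypothetical cluster-router routes; node subnets have different lengths.
ROUTES = [
    Route("src-ip", "10.244.0.5/32", "meg-gw"),     # MEG per-pod route
    Route("src-ip", "10.244.0.0/24", "node1-gw"),   # node1 subnet (src)
    Route("dst-ip", "10.244.1.0/25", "node2"),      # node2 subnet
    Route("dst-ip", "10.244.2.0/23", "node3"),      # node3 subnet
    Route("dst-ip", "0.0.0.0/0", "default-gw"),
]

if __name__ == "__main__":
    for name, key in [("old", old_key),
                      ("dst-over-src", dst_over_src_key),
                      ("src-over-dst", src_over_dst_key)]:
        print(name,
              select(ROUTES, "10.244.0.9", "10.244.1.7", key),
              select(ROUTES, "10.244.0.9", "10.244.2.9", key),
              select(ROUTES, "10.244.0.5", "8.8.8.8", key))
```

In this toy setup the old ordering sends east-west traffic from 10.244.0.9
to node2 directly (the /25 dst route wins) but to node1-gw for the node3
destination (the /24 src route beats the /23 dst route), while the MEG pod
reaches meg-gw only because its src route is a /32. With dst-over-src the
east-west case becomes consistent but the MEG pod falls through to
default-gw.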

Now the problem is that the ovn-k8s central mode cluster router requires
dst routes to be preferred over src routes, while the MEG feature requires
the opposite. In ovn-k8s IC mode, the cluster router src routes are no
longer required and can be removed, and central mode is going to be
deprecated anyway. So ovn-k8s would prefer src-over-dst.

At the same time, we are releasing OVN 25.09 tomorrow. Based on the above,
we have the following options:

Option 1: revert commit [1] before the 25.09 release. This is the easiest,
but IMHO it is not the right thing to do, since we would go back to the
*wrong* behavior and continue encouraging the bad design in the CMS.

Option 2: keep the current OVN behavior as the default in the 25.09
release, and add an option that lets users change the behavior so that src
routes are preferred over dst routes. ovn-k8s in particular can configure
src-over-dst to fix the MEG feature, but should do so only for IC mode (and
at the same time remove the src routes for IC mode). Central mode probably
has no users of the MEG feature and will be deprecated, so we assume it is
not a problem to keep the dst-over-src behavior there. This option can be
added after the 25.09 release as a bug fix (backported to 25.09).

Option 3: like option 2, make the behavior configurable, but default to
src-over-dst. The problems with this option are:
- it would immediately break ovn-k8s central mode east-west traffic,
because central mode relies on src-routes having lower priority.
- we had better make this change before the release, which is a little
risky and may delay it. Otherwise, we would end up changing the default
behavior again after the release.

I personally prefer option 2.
Thanks folks for the discussion. Opinions are welcome!

Best,
Han

[0] https://cloud-native.slack.com/archives/C08452HR8V6/p1756994862428589
[1] https://github.com/ovn-org/ovn/commit/27cc274e66acd9e0ed13525f9ea2597804107348