On Fri, Sep 5, 2025 at 1:15 AM Dumitru Ceara <[email protected]> wrote:
> On 9/4/25 8:51 PM, Han Zhou wrote:
> > Hi everyone,
> >
>
> Hi,
>
> I'm adding the rest of the ovn-kubernetes maintainers to the list of
> recipients of this email too.
>

Hi Han and Dumitru,

> > There is an issue raised in ovn-k8s community [0] due to a regression
> > introduced by OVN commit [1]. The commit breaks the MEG (Multiple
> > External Gateways) feature. The feature requires that src-route is
> > preferred over dst-route.
> >
> > It was considered risky to change the default behavior with commit [1],
> > and so it had been hanging around for several months asking for
> > feedback. Unfortunately this didn't get attention until now.
> >
> > The old behavior of OVN (before commit [1]): longest prefix length route
> > is preferred, regardless of src or dst. If the prefix length is the
> > same, prefer dst over src route. This was essentially wrong because src
> > and dst IP are different fields and it is not reasonable to compare the
> > prefix length between them. This old behavior leads to unreasonable
> > behavior for ovn-k8s central mode cluster router east-west traffic when
> > there are different node-subnet prefix lengths across nodes. The MEG
> > feature of ovn-k8s happened to work because the src routes added for
> > that feature were all /32. What the feature really requires is to prefer
> > src routes over dst routes. After commit [1], the unpredictable
> > east-west traffic routing problem in the central mode cluster router is
> > resolved, but the MEG feature is broken.
> >
> > Now the problem is, ovn-k8s' central mode cluster router requires dst
> > routes over src routes, while the MEG feature requires src routes over
> > dst routes. For ovn-k8s IC mode, the cluster router src routes are not
> > required any more, so they can be removed, and the central mode is going
> > to be deprecated anyway. So ovn-k8s would prefer src-over-dst.
> >
> > At the same time, we are releasing OVN 25.09 tomorrow.
> > Based on the above information, we have the options below:
> >
> > Option 1: revert the commit [1] before the 25.09 release. This is the
> > easiest, but IMHO it is not the right thing to do since we will go back
> > to the *wrong* behavior and continue encouraging the bad design of CMS.
> >
>
> My vote goes to Option 1 for now. The only reason I acked [1]
> originally was because there was _no_ objection from ovn-kubernetes
> maintainers and we were operating under the impression that _no_
> ovn-kubernetes features get broken by the change in behavior (see
> original discussion
>
> https://patchwork.ozlabs.org/project/ovn/patch/[email protected]/#3420641
> ).
>
> That turned out to be a wrong assumption. In my opinion, we cannot
> accept a behavior change that knowingly breaks users. So we should
> revert [1].
>

I agree that the revert is the safest way to proceed.

> > Option 2: keep the current behavior of OVN as default behavior in 25.09
> > release. We can add an option to let users change the behavior so that
> > src route is preferred over dst route. In particular for ovn-k8s, it can
> > be configured to src-over-dst so that the MEG feature can be fixed, but
> > it should only be used for IC mode (and at the same time remove the src
> > routes for IC mode). For central mode probably there is no user for the
> > MEG feature and central mode will be deprecated, so we assume it is not
> > a problem to keep the dst-over-src behavior. This option can be added
> > after the 25.09 release as a bug fix (backport to 25.09).
> >
>
> However, I don't think I'd oppose backporting a new feature like this.
> But it would have to be an opt-in feature, not an opt-out as suggested
> here.
>
> That is, I think we should revert [1] and add an opt-in knob or option
> to change behavior. We can backport this knob/option to 25.09.z.
>
> We already broke ovn-kubernetes [0]; why would we risk breaking other
> CMSs too?
>

Yeah, releasing with behavior that we know breaks some CMS isn't a good
idea. I wouldn't be against backporting an opt-in option; we have done
that multiple times in the past and it carries the least risk.

> > Option 3: like option 2, making the behavior configurable, but default
> > to src-over-dst. The problems of this option are:
> > - it would immediately break ovn-k8s central mode east-west traffic
> >   because the central mode relies on src-routes having lower priority.
> > - we'd better make this change before the release, which is a little
> >   risky and may delay the release. Otherwise, we would end up with
> >   changing the default behavior again after the release.
> >
> > I personally prefer option 2.
> > Thanks folks for the discussion. Opinions are welcome!
> >
> > Best,
> > Han
> >
>

Discovering this a week earlier would have given us time to make it
opt-in in 25.09 without the need for a revert, so the timing is a bit
unfortunate.

> Regards,
> Dumitru
>
> > [0] https://cloud-native.slack.com/archives/C08452HR8V6/p1756994862428589
> > [1] https://github.com/ovn-org/ovn/commit/27cc274e66acd9e0ed13525f9ea2597804107348

Regards,
Ales
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
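
For anyone following the thread who wants to see the orderings side by side, here is a rough Python sketch of the three route-preference policies discussed above. It is only a toy model, not OVN code: the `Route` class, the `select` function, the mode names ("legacy", "dst-over-src", "src-over-dst"), the addresses and the next-hop labels are all made up for illustration. "legacy" follows the pre-[1] description (longest prefix wins regardless of src/dst, ties go to dst), and the other two modes assume longest prefix is still used within the preferred route type.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    policy: str   # "src-ip" or "dst-ip" (hypothetical field names)
    prefix: str   # e.g. "10.244.0.0/24"
    nexthop: str  # illustrative label, not a real address

    @property
    def plen(self):
        return ipaddress.ip_network(self.prefix).prefixlen

    def matches(self, src, dst):
        # A src-ip route is matched on the packet source, a dst-ip route
        # on the packet destination.
        addr = src if self.policy == "src-ip" else dst
        return ipaddress.ip_address(addr) in ipaddress.ip_network(self.prefix)


def select(routes, src, dst, mode):
    """Pick one route for a packet under the given (assumed) policy."""
    candidates = [r for r in routes if r.matches(src, dst)]
    if not candidates:
        return None
    if mode == "legacy":
        # Longest prefix first, dst-ip wins ties.
        key = lambda r: (r.plen, r.policy == "dst-ip")
    elif mode == "dst-over-src":
        # Any dst-ip route beats any src-ip route.
        key = lambda r: (r.policy == "dst-ip", r.plen)
    elif mode == "src-over-dst":
        # Any src-ip route beats any dst-ip route.
        key = lambda r: (r.policy == "src-ip", r.plen)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return max(candidates, key=key)


# MEG-like case: a /32 src route next to a default dst route.
meg = [
    Route("src-ip", "10.244.1.5/32", "external-gw"),
    Route("dst-ip", "0.0.0.0/0", "default-gw"),
]

# East-west case with different node-subnet prefix lengths.
east_west = [
    Route("src-ip", "10.244.0.0/24", "node-a-gw-router"),
    Route("dst-ip", "10.244.2.0/23", "node-b"),
]

for mode in ("legacy", "dst-over-src", "src-over-dst"):
    m = select(meg, "10.244.1.5", "8.8.8.8", mode)
    e = select(east_west, "10.244.0.7", "10.244.2.9", mode)
    print(f"{mode:13s} MEG -> {m.nexthop:12s} east-west -> {e.nexthop}")
```

In this model "legacy" sends the MEG-like packet to the external gateway only because the src route happens to be a /32, and it sends the east-west packet to the source node's gateway router simply because /24 is longer than /23. That is the cross-field prefix comparison Han describes as wrong. The two strict orderings each fix one case and break the other, which is why a configurable knob comes up in options 2 and 3.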
