On 9/5/25 5:48 PM, Han Zhou wrote:
> On Fri, Sep 5, 2025 at 2:50 AM Dumitru Ceara <[email protected]> wrote:
>>
>> On 9/5/25 7:23 AM, Han Zhou wrote:
>>> On Thu, Sep 4, 2025 at 10:22 PM Han Zhou <[email protected]> wrote:
>>>>
>>>>
>>>>
>>>> On Thu, Sep 4, 2025 at 4:15 PM Dumitru Ceara <[email protected]> wrote:
>>>>>
>>>>> On 9/4/25 8:51 PM, Han Zhou wrote:
>>>>>> Hi everyone,
>>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> I'm adding the rest of the ovn-kubernetes maintainers to the list of
>>>>> recipients of this email too.
>>>>>
>>>>>> There is an issue raised in ovn-k8s community [0] due to a regression
>>>>>> introduced by OVN commit [1]. The commit breaks the MEG (Multiple
>>>>>> External Gateways) feature. The feature requires that src-route is
>>>>>> preferred over dst-route.
>>>>>>
>>>>>> It was considered risky to change the default behavior with commit
>>>>>> [1], and so it had been hanging around for several months asking for
>>>>>> feedback. Unfortunately this didn't get attention until now.
>>>>>>
>>>>>> The old behavior of OVN (before commit [1]): longest prefix length
>>>>>> route is preferred, regardless of src or dst. If the prefix length is
>>>>>> the same, prefer dst over src route. This was essentially wrong
>>>>>> because src and dst IP are different fields and it is not reasonable
>>>>>> to compare the prefix length between them. This old behavior leads to
>>>>>> unreasonable behavior for ovn-k8s central mode cluster router
>>>>>> east-west traffic when there are different node-subnet prefix lengths
>>>>>> across nodes. The MEG feature of ovn-k8s happened to work because the
>>>>>> src routes added for that feature were all /32. What the feature
>>>>>> really requires is to prefer src routes over dst routes. After commit
>>>>>> [1], the unpredictable east-west traffic routing problem in the
>>>>>> central mode cluster router is resolved, but the MEG feature is
>>>>>> broken.
>>>>>>
>>>>>> Now the problem is, ovn-k8s' central mode cluster router requires dst
>>>>>> routes over src routes, while the MEG feature requires src routes
>>>>>> over dst routes. For ovn-k8s IC mode, the cluster router src routes
>>>>>> are not required any more, so they can be removed, and the central
>>>>>> mode is going to be deprecated anyway. So ovn-k8s would prefer
>>>>>> src-over-dst.
>>>>>>
>>>>>> At the same time, we are releasing OVN 25.09 tomorrow. Based on the
>>>>>> above information, we have below options:
>>>>>>
>>>>>> Option 1: revert the commit [1] before the 25.09 release. This is the
>>>>>> easiest, but IMHO it is not the right thing to do since we will go
>>>>>> back to the *wrong* behavior and continue encouraging the bad design
>>>>>> of CMS.
>>>>>>
>>>>>
>>>>> My vote goes to Option 1 for now.  The only reason I acked [1]
>>>>> originally was because there was _no_ objection from ovn-kubernetes
>>>>> maintainers and we were operating under the impression that _no_
>>>>> ovn-kubernetes features get broken by the change in behavior (see
>>>>> original discussion
>>>>> https://patchwork.ozlabs.org/project/ovn/patch/[email protected]/#3420641 ).
>>>>>
>>>>> That turned out to be a wrong assumption.  In my opinion, we cannot
>>>>> accept a behavior change that knowingly breaks users.  So we should
>>>>> revert [1].
>>>>>
>>>>>> Option 2: keep the current behavior of OVN as default behavior in
>>>>>> 25.09 release. We can add an option to let users change the behavior
>>>>>> so that src route is preferred over dst route. In particular for
>>>>>> ovn-k8s, it can be configured to src-over-dst so that the MEG feature
>>>>>> can be fixed, but it should only be used for IC mode (and at the same
>>>>>> time remove the src routes for IC mode). For central mode probably
>>>>>> there is no user for the MEG feature and central mode will be
>>>>>> deprecated, so we assume it is not a problem to keep the dst-over-src
>>>>>> behavior. This option can be added after the 25.09 release as a bug
>>>>>> fix (backport to 25.09).
>>>>>>
>>>>>
>>>>> However, I don't think I'd oppose backporting a new feature like this.
>>>>> But it would have to be an opt-in feature, not an opt-out as suggested
>>>>> here.
>>>>>
>>>>> That is, I think we should revert [1] and add an opt-in knob or option
>>>>> to change behavior.  We can backport this knob/option to 25.09.z.
>>>>>
>>>>> We already broke ovn-kubernetes [0]; why would we risk breaking other
>>>>> CMSs too?
>>>>>
>>>>>> Option 3: like option 2, making the behavior configurable, but
>>>>>> default to src-over-dst. The problems of this option are:
>>>>>> - it would immediately break ovn-k8s central mode east-west traffic
>>>>>> because the central mode relies on src-routes having lower priority.
>>>>>> - we'd better make this change before the release, which is a little
>>>>>> risky and may delay the release. Otherwise, we would end up with
>>>>>> changing the default behavior again after the release.
>>>>>>
>>>>>> I personally prefer option 2.
>>>>>> Thanks folks for the discussion. Opinions are welcome!
>>>>>>
>>>>>> Best,
>>>>>> Han
>>>>>>
>>>>>
>>>>> Regards,
>>>>> Dumitru
>>>>>
>>>>
>>>> Thanks Dumitru for the feedback. I sent a patch to revert the commit
>>>> [1]. If there are no other opinions, we may merge it and backport to
>>>> 25.09 before the release.
>>>
>>
>> Hi Han,
>>
>> Ales and Surya shared their opinions [0] [1].  I went ahead and merged
>> the revert.  Would you have time to work on making the routing behavior
>> configurable (default as it is on 25.03 and opt-in)?  As discussed we
>> can backport that to branch-25.09 after the release.
>>
> 
> Thanks for reviewing and merging it! Yes I can work on the configurable
> behavior.

Hi Han,

> Checking all the discussions so far from you, Ilya and Tim, it seems
> keeping the current behavior as default and adding an opt-in option for
> both src-over-dst and dst-over-src (while keeping the router policy stage
> untouched) is the most practical approach. Any objections?
> 

I also think that's a good approach and that can also be backported to
25.09.  So +1 from me.
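
To make the three orderings discussed in this thread concrete, here is a
minimal Python sketch.  It is only an illustration, not OVN code: the
Route class, the select_route() helper and the mode names ("legacy",
"dst-over-src", "src-over-dst") are hypothetical, and the eventual
option name in OVN may well differ.  "legacy" models the behavior
before commit [1] (restored by the revert), "dst-over-src" the behavior
after commit [1], and "src-over-dst" what the MEG feature needs.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str   # e.g. "10.244.2.0/24"
    policy: str   # "src-ip" or "dst-ip"
    nexthop: str


def _plen(route):
    return ipaddress.ip_network(route.prefix).prefixlen


def _matches(route, src, dst):
    # A src-ip route is matched against the source address,
    # a dst-ip route against the destination address.
    addr = src if route.policy == "src-ip" else dst
    return ipaddress.ip_address(addr) in ipaddress.ip_network(route.prefix)


def select_route(routes, src, dst, mode="legacy"):
    """Pick one route; 'mode' is a hypothetical knob, not an OVN option."""
    candidates = [r for r in routes if _matches(r, src, dst)]
    if not candidates:
        return None
    if mode == "legacy":
        # Longest prefix first, regardless of src/dst; dst wins ties.
        key = lambda r: (_plen(r), r.policy == "dst-ip")
    elif mode == "dst-over-src":
        # Any dst route beats any src route; longest prefix within a kind.
        key = lambda r: (r.policy == "dst-ip", _plen(r))
    elif mode == "src-over-dst":
        # Any src route beats any dst route; longest prefix within a kind.
        key = lambda r: (r.policy == "src-ip", _plen(r))
    else:
        raise ValueError(f"unknown mode: {mode}")
    return max(candidates, key=key)


# Made-up addresses: a /32 src route (MEG-like) and a /24 dst route.
routes = [
    Route("10.244.1.5/32", "src-ip", "172.16.0.1"),
    Route("10.244.2.0/24", "dst-ip", "100.64.0.3"),
]
for mode in ("legacy", "dst-over-src", "src-over-dst"):
    chosen = select_route(routes, "10.244.1.5", "10.244.2.9", mode)
    print(mode, "->", chosen.nexthop)
```

With a /32 src route, "legacy" and "src-over-dst" pick the same next
hop, which is why MEG happened to work before commit [1], while
"dst-over-src" picks the dst route instead.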

But in the long term I think we should consider implementing the
suggestion Ilya had:

https://mail.openvswitch.org/pipermail/ovs-dev/2025-September/426021.html

I think it would align OVN with other well-established implementations,
potentially avoiding confusion and making it easier to use.  As a set of
separate features of course. :)

Regards,
Dumitru

> Best,
> Han
> 
>> Regards,
>> Dumitru
>>
>> [0]
>> https://mail.openvswitch.org/pipermail/ovs-dev/2025-September/426008.html
>> [1]
>> https://mail.openvswitch.org/pipermail/ovs-dev/2025-September/426012.html
>>
>>> Sorry, forgot the link to the patch:
>>> https://patchwork.ozlabs.org/project/ovn/patch/[email protected]/
>>>
>>>>
>>>> Best,
>>>> Han
>>>>
>>>>>> [0] https://cloud-native.slack.com/archives/C08452HR8V6/p1756994862428589
>>>>>> [1] https://github.com/ovn-org/ovn/commit/27cc274e66acd9e0ed13525f9ea2597804107348
>>>
>>>>
>>>
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
