On 8/4/25 4:32 PM, Felix Huettner wrote:
> On Mon, Aug 04, 2025 at 12:33:28PM +0200, Dumitru Ceara wrote:
>> In 712fca55b3b1 ("controller: Prioritize host routes.") and later in
>> cd4ad2f56179 ("northd: Redistribution of NAT/LB routes.") support was
>> added for advertising routes for objects (logical port / NAT / LB IPs)
>> that are bound to a single chassis (e.g., distributed NAT) with a better
>> metric on the chassis where they're bound. On all other chassis,
>> however, the route was still advertised but only with a worse metric.
>>
>> While this works fine in deployments as described in 712fca55b3b1
>> ("controller: Prioritize host routes."), this behavior actually makes
>> the dynamic routing feature unusable in cases when all inter-node
>> traffic is forwarded through the L3 fabric, e.g., spine-leaf topologies
>> with iBGP between leaves and spines and eBGP between OVN compute nodes
>> and fabric leaves. That is because eBGP routes are usually preferred
>> over iBGP routes.
>>
>> Consider the following example:
>>   +------+    +------+
>>   |Spine1|    |Spine2|
>>   +------+    +------+
>>      |  \     /  |
>>      |   \   /   |
>>      |    \ /    |
>>      |     X     |
>>      |    / \    |
>>      |   /   \   |
>>      |  /     \  |
>>   +-----+     +-----+
>>   |Leaf1|     |Leaf2|
>>   +-----+     +-----+
>>      |           |
>>      |           |
>> +----------+ +----------+
>> | Chassis1 | | Chassis2 |
>> +----------+ +----------+
>>
>> An OVN distributed NAT, e.g., 42.42.42.42 "bound" to Chassis1 would be
>> advertised with a metric of 100 on the eBGP Chassis1 <-> Leaf1
>> connection and with a metric of 1000 (worse) on the eBGP Chassis2 <->
>> Leaf2 connection. Leaf2 will also learn an iBGP route (through Spine1
>> and Spine2) for the same prefix (towards Chassis1) but because eBGP
>> administrative distance is better than the iBGP one, Leaf2 will always
>> prefer the metric 1000 route. That means Leaf2 will always forward
>> traffic destined to 42.42.42.42 via Chassis2 which is sub-optimal.
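
To make the selection on Leaf2 concrete, here is a minimal Python sketch
of the comparison described above. It is a simplification, not a real BGP
best-path implementation: the administrative distance values (20 for
eBGP, 200 for iBGP) are common vendor defaults that I am assuming, and
the Route type, best_route() helper and next-hop labels are hypothetical
names used only for illustration.

```python
from dataclasses import dataclass

# Assumed administrative distances (common vendor defaults, not taken from
# the commit message). Lower values are preferred.
ADMIN_DISTANCE = {"ebgp": 20, "ibgp": 200}


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str   # illustrative label, not a real address
    protocol: str   # "ebgp" or "ibgp"
    metric: int


def best_route(candidates):
    """Pick the route with the lowest (administrative distance, metric)."""
    return min(candidates,
               key=lambda r: (ADMIN_DISTANCE[r.protocol], r.metric))


# Routes Leaf2 knows for the NAT IP in the example topology.
leaf2_routes = [
    Route("42.42.42.42/32", "Chassis2", "ebgp", 1000),  # worse metric, eBGP
    Route("42.42.42.42/32", "Spine1", "ibgp", 100),     # towards Chassis1
    Route("42.42.42.42/32", "Spine2", "ibgp", 100),     # towards Chassis1
]

chosen = best_route(leaf2_routes)
print(chosen.next_hop, chosen.metric)
```

Because the administrative distance is compared before the metric, the
metric 1000 route via Chassis2 wins. Once that route is no longer
advertised, only the iBGP routes towards Chassis1 remain as candidates.
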
>>
>> The main reason for advertising the (NAT) IP on both chassis was likely
>> to provide redundancy in case traffic hits the OVN cluster on a node
>> that doesn't host the NAT. But with topologies such as the one
>> depicted above, the redundancy is handled by the fabric.
>>
>> OVN didn't have a way to disable worse metric route advertisements. This
>> commit adds one, through a new logical router / logical router port
>> option, "dynamic-routing-redistribute-local-only" which, if enabled,
>> informs ovn-controller to not advertise routes for chassis-bound IPs
>> (SB.Advertised_Route.tracked_port set) on chassis where the tracked port
>> is not bound. By default this option is disabled. The option is
>> propagated by ovn-northd to the SB.Port_Binding corresponding to the
>> logical router port (or all router ports if configured on the router).
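
The per-chassis decision described above can be sketched roughly as
follows. This is a Python model of the behavior, not the actual
ovn-controller code (which is C). The AdvertisedRoute type, the
should_advertise() helper and the port name "lsp-vm1" are hypothetical.
Only the rule itself (skip routes whose tracked port is not bound
locally when the option is enabled) comes from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AdvertisedRoute:
    ip_prefix: str
    # Port the IP is bound to; None means the route is not chassis-bound.
    tracked_port: Optional[str] = None


def should_advertise(route, locally_bound_ports, local_only=False):
    """Decide whether the local chassis advertises this route.

    local_only models "dynamic-routing-redistribute-local-only".
    """
    if route.tracked_port is None:
        # Not a chassis-bound IP: the option does not apply.
        return True
    if route.tracked_port in locally_bound_ports:
        # Tracked port is bound here: always advertise.
        return True
    # Tracked port is bound elsewhere: advertise only if the option is off
    # (the default), in which case the worse-metric route is still sent.
    return not local_only


nat_route = AdvertisedRoute("42.42.42.42/32", tracked_port="lsp-vm1")
chassis1_ports = {"lsp-vm1"}   # the NAT is "bound" to Chassis1
chassis2_ports = set()

for name, ports in (("Chassis1", chassis1_ports),
                    ("Chassis2", chassis2_ports)):
    print(name, should_advertise(nat_route, ports, local_only=True))
```

With the option enabled, only Chassis1 advertises 42.42.42.42. With the
default (disabled), both chassis advertise it, as before this change.
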
>>
>> Fixes: 712fca55b3b1 ("controller: Prioritize host routes.")
>> Fixes: cd4ad2f56179 ("northd: Redistribution of NAT/LB routes.")
>> Reported-at: https://issues.redhat.com/browse/FDP-1464
>> Tested-by: Jakub Libosvar <[email protected]>
>> Signed-off-by: Dumitru Ceara <[email protected]>
>
> Acked-by: Felix Huettner <[email protected]>
>
Hi Felix, Jakub,
Thanks for the review and testing! Applied to main and 25.03.
Regards,
Dumitru