On Fri, Mar 6, 2026 at 10:37 AM Rukomoinikova Aleksandra <[email protected]> wrote: > > On 06.03.2026 18:21, Mark Michelson wrote: > > On Fri, Mar 6, 2026 at 4:51 AM Ales Musil <[email protected]> wrote: > >> > >> > >> On Thu, Mar 5, 2026 at 6:44 PM Mark Michelson via dev > >> <[email protected]> wrote: > >>> Hi Alexandra, thanks for the patch! > >>> > >>> I have a couple of small notes below. They're so minor, that I think > >>> that a maintainer can fix them when merging, so there is no need for a > >>> v4. > >>> > >>> Acked-by: Mark Michelson <[email protected]> > >>> > >>> On Mon, Mar 2, 2026 at 7:35 AM Alexandra Rukomoinikova > >>> <[email protected]> wrote: > >>>> The learned routes hash previously only used route parameters (prefix, > >>>> nexthop, origin, route_table). This caused collisions when identical > >>>> routes were learned through different transit switches, leading to > >>>> infinite route re-learning. > >>>> > >>>> While such configurations are invalid due to overlapping IP addressing > >>>> - ICshould handle them without looping. > >>>> > >>>> Fix by including the IC route UUID in the hash to distinguish routes > >>>> with identical parameters but different origins. As a result, routers > >>>> will now learn the correct number of prefixes as ECMP routes, making > >>>> the misconfiguration visible to the user. > >>>> > >>>> Signed-off-by: Alexandra Rukomoinikova <[email protected]> > >>>> --- > >>>> v1 --> v2: fixed typo in the tests > >>>> v2 --> v3: fixed hash calculation > >>>> --- > >>>> ic/ovn-ic.c | 56 ++++++++++++++-------------- > >>>> tests/ovn-ic.at | 98 +++++++++++++++++++++++++++++++++++++++++++++++++ > >>>> 2 files changed, 127 insertions(+), 27 deletions(-) > >>>> > >>>> diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c > >>>> index 95d73cb4b..7d2f9442a 100644 > >>>> --- a/ic/ovn-ic.c > >>>> +++ b/ic/ovn-ic.c > >>>> @@ -1286,6 +1286,7 @@ struct ic_route_info { > >>>> const char *origin; > >>>> const char *route_table; > >>>> const char *route_tag; > >>>> + struct uuid ic_route_uuid; > >>>> > >>>> const struct nbrec_logical_router *nb_lr; > >>>> > >>>> @@ -1305,9 +1306,11 @@ struct ic_route_info { > >>>> static uint32_t > >>>> ic_route_hash(const struct in6_addr *prefix, unsigned int plen, > >>>> const struct in6_addr *nexthop, const char *origin, > >>>> - const char *route_table) > >>>> + const char *route_table, const struct uuid *ic_route_uuid) > >>>> { > >>>> - uint32_t basis = hash_bytes(prefix, sizeof *prefix, (uint32_t)plen); > >>>> + uint32_t basis = ic_route_uuid ? uuid_hash(ic_route_uuid) : 0; > >>>> + basis = hash_bytes(prefix, sizeof *prefix, basis); > >>>> + basis = hash_int((uint32_t) plen, basis); > >>>> basis = hash_string(origin, basis); > >>>> basis = hash_string(route_table, basis); > >>>> return hash_bytes(nexthop, sizeof *nexthop, basis); > >>>> @@ -1316,18 +1319,22 @@ ic_route_hash(const struct in6_addr *prefix, > >>>> unsigned int plen, > >>>> static struct ic_route_info * > >>>> ic_route_find(struct hmap *routes, const struct in6_addr *prefix, > >>>> unsigned int plen, const struct in6_addr *nexthop, > >>>> - const char *origin, const char *route_table, uint32_t > >>>> hash) > >>>> + const char *origin, const char *route_table, > >>>> + const struct uuid *ic_route_uuid, uint32_t hash) > >>>> { > >>>> struct ic_route_info *r; > >>>> if (!hash) { > >>>> - hash = ic_route_hash(prefix, plen, nexthop, origin, > >>>> route_table); > >>>> + hash = ic_route_hash(prefix, plen, nexthop, origin, route_table, > >>>> + ic_route_uuid); > >>>> } > >>>> HMAP_FOR_EACH_WITH_HASH (r, node, hash, routes) { > >>>> if (ipv6_addr_equals(&r->prefix, prefix) && > >>>> r->plen == plen && > >>>> ipv6_addr_equals(&r->nexthop, nexthop) && > >>>> !strcmp(r->origin, origin) && > >>>> - !strcmp(r->route_table ? r->route_table : "", route_table)) > >>>> { > >>>> + !strcmp(r->route_table ? r->route_table : "", route_table) > >>>> && > >>>> + (!ic_route_uuid || uuid_equals(&r->ic_route_uuid, > >>>> + ic_route_uuid))) { > >>>> return r; > >>>> } > >>>> } > >>>> @@ -1370,7 +1377,8 @@ parse_route(const char *s_prefix, const char > >>>> *s_nexthop, > >>>> static bool > >>>> add_to_routes_learned(struct hmap *routes_learned, > >>>> const struct nbrec_logical_router_static_route > >>>> *nb_route, > >>>> - const struct nbrec_logical_router *nb_lr) > >>>> + const struct nbrec_logical_router *nb_lr, > >>>> + const struct uuid *ic_route_uuid) > >>>> { > >>>> struct in6_addr prefix, nexthop; > >>>> unsigned int plen; > >>>> @@ -1379,8 +1387,11 @@ add_to_routes_learned(struct hmap *routes_learned, > >>>> return false; > >>>> } > >>>> const char *origin = smap_get_def(&nb_route->options, "origin", > >>>> ""); > >>>> + > >>>> + uint32_t hash = ic_route_hash(&prefix, plen, &nexthop, origin, > >>>> + nb_route->route_table, ic_route_uuid); > >>>> if (ic_route_find(routes_learned, &prefix, plen, &nexthop, origin, > >>>> - nb_route->route_table, 0)) { > >>>> + nb_route->route_table, ic_route_uuid, hash)) { > >>>> /* Route was added to learned on previous iteration. */ > >>>> return true; > >>>> } > >>>> @@ -1393,9 +1404,9 @@ add_to_routes_learned(struct hmap *routes_learned, > >>>> ic_route->origin = origin; > >>>> ic_route->route_table = nb_route->route_table; > >>>> ic_route->nb_lr = nb_lr; > >>>> - hmap_insert(routes_learned, &ic_route->node, > >>>> - ic_route_hash(&prefix, plen, &nexthop, origin, > >>>> - nb_route->route_table)); > >>>> + memcpy(&ic_route->ic_route_uuid, ic_route_uuid, sizeof(struct > >>>> uuid)); > >>> There's no need for memcpy here. You can just do a struct assignment: > >>> > >>> ic_route->ic_route_uuid = *ic_route_uuid; > >>> > >>>> + hmap_insert(routes_learned, &ic_route->node, hash); > >>>> + > >>>> return true; > >>>> } > >>>> > >>>> @@ -1562,10 +1573,11 @@ add_to_routes_ad(struct hmap *routes_ad, const > >>>> struct in6_addr prefix, > >>>> route_table = ""; > >>>> } > >>>> > >>>> - uint hash = ic_route_hash(&prefix, plen, &nexthop, origin, > >>>> route_table); > >>>> + uint hash = ic_route_hash(&prefix, plen, &nexthop, origin, > >>>> + route_table, NULL); > >>>> > >>>> if (!ic_route_find(routes_ad, &prefix, plen, &nexthop, origin, > >>>> route_table, > >>>> - hash)) { > >>>> + NULL, hash)) { > >>>> struct ic_route_info *ic_route = xzalloc(sizeof *ic_route); > >>>> ic_route->prefix = prefix; > >>>> ic_route->plen = plen; > >>>> @@ -2154,20 +2166,9 @@ sync_learned_routes(struct ic_context *ctx, > >>>> struct ic_route_info *route_learned > >>>> = ic_route_find(&ic_lr->routes_learned, &prefix, plen, > >>>> &nexthop, isb_route->origin, > >>>> - isb_route->route_table, 0); > >>>> + isb_route->route_table, > >>>> + &isb_route->header_.uuid, 0); > >>>> if (route_learned) { > >>>> - /* Sync external-ids */ > >>>> - struct uuid ext_id; > >>>> - smap_get_uuid(&route_learned->nb_route->external_ids, > >>>> - "ic-learned-route", &ext_id); > >>>> - if (!uuid_equals(&ext_id, &isb_route->header_.uuid)) { > >>>> - char *uuid_s = > >>>> - xasprintf(UUID_FMT, > >>>> - UUID_ARGS(&isb_route->header_.uuid)); > >>>> - > >>>> nbrec_logical_router_static_route_update_external_ids_setkey( > >>>> - route_learned->nb_route, "ic-learned-route", > >>>> uuid_s); > >>>> - free(uuid_s); > >>>> - } > >>>> hmap_remove(&ic_lr->routes_learned, > >>>> &route_learned->node); > >>>> free(route_learned); > >>>> } else { > >>>> @@ -2275,7 +2276,7 @@ advertise_routes(struct ic_context *ctx, > >>>> } > >>>> struct ic_route_info *route_adv = > >>>> ic_route_find(routes_ad, &prefix, plen, &nexthop, > >>>> - isb_route->origin, isb_route->route_table, 0); > >>>> + isb_route->origin, isb_route->route_table, > >>>> NULL, 0); > >>>> if (!route_adv) { > >>>> /* Delete the extra route from IC-SB. */ > >>>> VLOG_DBG("Delete route %s -> %s from IC-SB, which is not > >>>> found" > >>>> @@ -2349,7 +2350,8 @@ build_ts_routes_to_adv(struct ic_context *ctx, > >>>> if (smap_get_uuid(&nb_route->external_ids, "ic-learned-route", > >>>> &isb_uuid)) { > >>>> /* It is a learned route */ > >>>> - if (!add_to_routes_learned(&ic_lr->routes_learned, > >>>> nb_route, lr)) { > >>>> + if (!add_to_routes_learned(&ic_lr->routes_learned, > >>>> nb_route, lr, > >>>> + &isb_uuid)) { > >>>> static struct vlog_rate_limit rl = > >>>> VLOG_RATE_LIMIT_INIT(5, 1); > >>>> VLOG_WARN_RL(&rl, "Bad format of learned route in NB: " > >>>> "%s -> %s. Delete it.", > >>>> nb_route->ip_prefix, > >>>> diff --git a/tests/ovn-ic.at b/tests/ovn-ic.at > >>>> index 1a826aa1c..c633cdeb4 100644 > >>>> --- a/tests/ovn-ic.at > >>>> +++ b/tests/ovn-ic.at > >>>> @@ -4594,3 +4594,101 @@ OVN_CLEANUP_IC([az1], [az2]) > >>>> > >>>> AT_CLEANUP > >>>> ]) > >>>> + > >>>> +OVN_FOR_EACH_NORTHD([ > >>>> +AT_SETUP([ovn-ic -- check loop with annonce duplicated prefixes - > >>>> multiple ts]) > >>> s/annonce/announce/ > >>> > >>> The same mistake happens a few other times in this test as well. > >>> > >>> > >>> > >>>> + > >>>> +ovn_init_ic_db > >>>> +ovn-ic-nbctl ts-add ts11 > >>>> +ovn-ic-nbctl ts-add ts12 > >>>> + > >>>> +for i in 1 2; do > >>>> + ovn_start az$i > >>>> + ovn_as az$i > >>>> + > >>>> + # Enable route learning at AZ level > >>>> + check ovn-nbctl set nb_global . options:ic-route-learn=true > >>>> + # Enable route advertising at AZ level > >>>> + check ovn-nbctl set nb_global . options:ic-route-adv=true > >>>> +done > >>>> + > >>>> +for i in 1 2; do > >>>> + ovn_as az$i > >>>> + > >>>> + check ovn-nbctl lr-add lr$i > >>>> + > >>>> + check ovn-nbctl lrp-add lr$i lr-lrp-$i aa:aa:aa:aa:a1:0$i > >>>> 169.254.101.1/24 > >>>> + check ovn-nbctl lsp-add-router-port ts11 lsp-az$i lr-lrp-$i > >>>> + > >>>> + # Create subnet for annonce. > >>>> + check ovn-nbctl lrp-add lr$i lr-lrp-subnet-$i aa:aa:aa:aa:a2:0$i > >>>> 192.168.0.$i/24 > >>>> + > >>>> + # Create one more lrp-ts port through another transit switch. > >>>> + check ovn-nbctl lrp-add lr$i lr-lrp-dup$i aa:aa:aa:aa:a3:01 > >>>> 169.254.101.1/24 > >>>> + check ovn-nbctl lsp-add-router-port ts12 lsp-dup-az$i lr-lrp-dup$i > >>>> +done > >>>> + > >>>> +OVS_WAIT_FOR_OUTPUT([ovn_as az1 ovn-nbctl lr-route-list lr1 | grep > >>>> 192.168 | > >>>> + grep learned | awk '{print $1, $2}' | sort ], [0], > >>>> [dnl > >>>> +192.168.0.0/24 169.254.101.1 > >>>> +192.168.0.0/24 169.254.101.1 > >>>> +]) > >>>> + > >>>> +OVN_CLEANUP_IC([az1], [az2]) > >>>> +AT_CLEANUP > >>>> +]) > >>>> + > >>>> +OVN_FOR_EACH_NORTHD([ > >>>> +AT_SETUP([ovn-ic -- check loop with annonce duplicated prefixes - > >>>> single ts]) > >>>> + > >>>> +ovn_init_ic_db > >>>> + > >>>> +for i in 1 2 3; do > >>>> + ovn_start az$i > >>>> + ovn_as az$i > >>>> + > >>>> + # Enable route learning at AZ level > >>>> + check ovn-nbctl set nb_global . options:ic-route-learn=true > >>>> + # Enable route advertising at AZ level > >>>> + check ovn-nbctl set nb_global . options:ic-route-adv=true > >>>> +done > >>>> + > >>>> +# Create new transit switches and LRs. Test topology is next: > >>>> +# logical router (lr12) > >>>> +# | > >>>> +# logical router (lr11) - / transit switch (ts11) \ - logical router > >>>> (lr13) > >>>> +# > >>>> + > >>>> +# Create lr11, lr13, ts11 and connect them > >>>> +for i in 1 3; do > >>>> + ovn_as az$i > >>>> + > >>>> + check ovn-nbctl lr-add lr1$i > >>>> + check ovn-ic-nbctl --wait=sb --may-exist ts-add ts11 > >>>> + > >>>> + # Create LRP and connect to TS > >>>> + check ovn-nbctl lrp-add lr1$i lrp-lr1$i-ts11 aa:aa:aa:aa:a1:0$i > >>>> 169.254.101.1/24 > >>>> + check ovn-nbctl lsp-add-router-port ts11 lsp-ts11-lr1$i > >>>> lrp-lr1$i-ts11 > >>>> +done > >>>> + > >>>> +# Create lr12 and connect it to ts11 > >>>> +ovn_as az2 > >>>> +check ovn-nbctl lr-add lr12 > >>>> + > >>>> +# Create LRP and connect to TS > >>>> +check ovn-nbctl lrp-add lr12 lrp-lr12-ts11 aa:aa:aa:aa:a1:03 > >>>> 169.254.101.2/24 > >>>> +check ovn-nbctl lsp-add-router-port ts11 lsp-lr12-ts11 lrp-lr12-ts11 > >>>> + > >>>> +# Create directly-connected route in lr11 > >>>> +check ovn_as az1 ovn-nbctl lrp-add lr11 lrp-lr11 aa:aa:aa:aa:bb:01 > >>>> "192.168.0.1/24" > >>>> +check ovn_as az3 ovn-nbctl lrp-add lr13 lrp-lr13 aa:aa:aa:aa:bb:03 > >>>> "192.168.0.1/24" > >>>> + > >>>> +OVS_WAIT_FOR_OUTPUT([ovn_as az2 ovn-nbctl lr-route-list lr12 | grep > >>>> 192.168 | > >>>> + grep learned | awk '{print $1, $2}' | sort ], [0], > >>>> [dnl > >>>> +192.168.0.0/24 169.254.101.1 > >>>> +192.168.0.0/24 169.254.101.1 > >>>> +]) > >>>> + > >>>> +OVN_CLEANUP_IC([az1], [az2], [az3]) > >>>> +AT_CLEANUP > >>>> +]) > >>>> -- > >>>> 2.48.1 > >>>> > >>>> _______________________________________________ > >>>> dev mailing list > >>>> [email protected] > >>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >>>> > >>> _______________________________________________ > >>> dev mailing list > >>> [email protected] > >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> > >> Thank you Alexandra and Mark, > >> > >> I'm slightly confused; the commit message says it's a fix, but there isn't > >> a Fixes tag. Also based on the commit message about invalid configuration > >> I don't think this needs a backport, please let me know if that's not the > >> case. I've fixed the nits and applied to main. > >> > >> Regards, > >> Ales > > Hi Ales, > > > > A specific invalid configuration can cause `ovn-nbctl lr-route-list` > > to loop infinitely, listing the same routes forever. If you try to run > > the first new test added in this patch on an unpatched OVN main, then > > the test fails because it continuously lists "192.168.0.0/24 > > 169.254.101.1" in its output. This patch is ensuring that this invalid > > configuration allows `ovn-nbctl lr-route-list` to work as expected. > > > > I tried to find what an appropriate "Fixes" tag would be, but I had > > trouble pinpointing it. The issue of hash collisions for identical > > routes appears to have existed since route advertising was added to > > ovn-ic (57b347c55 ("ovn-ic: Route advertisement.")). I believe this > > patch needs to be backported all the way down to branch-24.03. I > > confirmed that if I apply the first test from this patch to > > branch-24.03 and running it there, I see the same infinite loop > > behavior as I see in unpatched main. > > > Sorry for the noise, I don't think I can read. I thought my patch wasn't > working on 24.03, and after checking that it was working, I realized I > must have misread your email =) > > -- > regards, > Alexandra.
Yes, sorry for the confusion. I was trying to instruct Ales that the patch needs to be backported all the way back to branch-24.03. I was showing that your test reveals that the bug exists in 24.03. > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
