Hi Dumitru,

Regards,
Vladislav Odintsov

> On 12 Dec 2022, at 17:57, Dumitru Ceara <dce...@redhat.com> wrote:
> 
> On 12/12/22 14:39, Vladislav Odintsov wrote:
>> Hi Numan, Dumitru,
>> 
>> I forgot one thing about a possible upgrade recommendation for the users.
>> Fix can be applied much easier: just upgrading all ovn-ic in all AZs first 
>> and then converting DB schema. This will remove all duplicates.
>> The latter can be requested manually invoking systemctl restart ovn-ic-db 
>> (at least for RH, there is a separate systemd unit for IC databases) or 
>> calling ovsdb-client convert … command directly.
>> 
> 
> This is a good point, especially because we don't define anywhere
> (AFAIK) an order of upgrading components when it comes to ovn-ic.
> 
> We should document this in ovn-upgrades.rst.
> 
>> Would it be an acceptable compromise for this situation?
> 
> This has a long term implication though: ovn-ic must always be backwards
> compatible with ovn-ic-SB schema changes.  That is not the case with
> ovn-northd and SB schema.  If people think that's ok, I'm fine with that
> too but we need to make sure we properly document this.
> 

As nobody answers, should I:
1. Split patch #3 by two: fix of duplicates (in order it do be backportable to 
older branches) and the second part is an index changes + upgrade docs, which 
will be a part of a major upgrade and will not be considered for backporting.
2. Describe in the second part, that if user upgrades from any version prior to 
this major release it should upgrade all ovn-ic daemons on all availability 
zones first and then upgrade ovn-ic-sb schema?

If that has no objections, I’ll send a new patchset for a final review.

> Regards,
> Dumitru
> 
>> 
>> Regards,
>> Vladislav Odintsov
>> 
>>> On 12 Dec 2022, at 16:20, Numan Siddique <num...@ovn.org> wrote:
>>> 
>>> On Mon, Dec 12, 2022 at 5:41 AM Dumitru Ceara <dce...@redhat.com> wrote:
>>>> 
>>>> On 12/9/22 19:42, Vladislav Odintsov wrote:
>>>>> Hi Dumitru,
>>>> 
>>>> Hi Vladislav,
>>>> 
>>>>> 
>>>>> please see answers inline.
>>>>> 
>>>>> Regards,
>>>>> Vladislav Odintsov
>>>>> 
>>>>>> On 9 Dec 2022, at 17:37, Dumitru Ceara <dce...@redhat.com> wrote:
>>>>>> 
>>>>>> On 12/6/22 11:20, Vladislav Odintsov wrote:
>>>>>>> Signed-off-by: Vladislav Odintsov <odiv...@gmail.com 
>>>>>>> <mailto:odiv...@gmail.com>>
>>>>>>> ---
>>>>>> 
>>>>>> Hi Vladislav,
>>>>>> 
>>>>>>> ic/ovn-ic.c         | 83 +++++++++++++++++++++++++++------------------
>>>>>>> ovn-ic-sb.ovsschema |  6 ++--
>>>>>>> tests/ovn-ic.at     | 60 ++++++++++++++++++++++++++++++++
>>>>>>> 3 files changed, 114 insertions(+), 35 deletions(-)
>>>>>>> 
>>>>>>> diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
>>>>>>> index 9e2369fef..64307d8c4 100644
>>>>>>> --- a/ic/ovn-ic.c
>>>>>>> +++ b/ic/ovn-ic.c
>>>>>>> @@ -884,10 +884,12 @@ ic_route_hash(const struct in6_addr *prefix, 
>>>>>>> unsigned int plen,
>>>>>>> static struct ic_route_info *
>>>>>>> ic_route_find(struct hmap *routes, const struct in6_addr *prefix,
>>>>>>>             unsigned int plen, const struct in6_addr *nexthop,
>>>>>>> -              const char *origin, char *route_table)
>>>>>>> +              const char *origin, const char *route_table, uint32_t 
>>>>>>> hash)
>>>>>>> {
>>>>>>>   struct ic_route_info *r;
>>>>>>> -    uint32_t hash = ic_route_hash(prefix, plen, nexthop, origin, 
>>>>>>> route_table);
>>>>>>> +    if (!hash) {
>>>>>>> +        hash = ic_route_hash(prefix, plen, nexthop, origin, 
>>>>>>> route_table);
>>>>>>> +    }
>>>>>>>   HMAP_FOR_EACH_WITH_HASH (r, node, hash, routes) {
>>>>>>>       if (ipv6_addr_equals(&r->prefix, prefix) &&
>>>>>>>           r->plen == plen &&
>>>>>>> @@ -945,8 +947,8 @@ add_to_routes_learned(struct hmap *routes_learned,
>>>>>>>   }
>>>>>>>   const char *origin = smap_get_def(&nb_route->options, "origin", "");
>>>>>>>   if (ic_route_find(routes_learned, &prefix, plen, &nexthop, origin,
>>>>>>> -                      nb_route->route_table)) {
>>>>>>> -        /* Route is already added to learned in previous iteration. */
>>>>>>> +                      nb_route->route_table, 0)) {
>>>>>>> +        /* Route was added to learned on previous iteration. */
>>>>>>>       return true;
>>>>>>>   }
>>>>>>> 
>>>>>>> @@ -1093,10 +1095,42 @@ route_need_advertise(const char *policy,
>>>>>>> }
>>>>>>> 
>>>>>>> static void
>>>>>>> -add_to_routes_ad(struct hmap *routes_ad,
>>>>>>> -                 const struct nbrec_logical_router_static_route 
>>>>>>> *nb_route,
>>>>>>> -                 const struct lport_addresses *nexthop_addresses,
>>>>>>> -                 const struct smap *nb_options, const char 
>>>>>>> *route_table)
>>>>>>> +add_to_routes_ad(struct hmap *routes_ad, const struct in6_addr prefix,
>>>>>>> +                 unsigned int plen, const struct in6_addr nexthop,
>>>>>>> +                 const char *origin, const char *route_table,
>>>>>>> +                 const struct nbrec_logical_router_port *nb_lrp,
>>>>>>> +                 const struct nbrec_logical_router_static_route 
>>>>>>> *nb_route)
>>>>>>> +{
>>>>>>> +    if (route_table == NULL) {
>>>>>>> +        route_table = "";
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    uint hash = ic_route_hash(&prefix, plen, &nexthop, origin, 
>>>>>>> route_table);
>>>>>>> +
>>>>>>> +    if (!ic_route_find(routes_ad, &prefix, plen, &nexthop, origin, 
>>>>>>> route_table,
>>>>>>> +                       hash)) {
>>>>>>> +        struct ic_route_info *ic_route = xzalloc(sizeof *ic_route);
>>>>>>> +        ic_route->prefix = prefix;
>>>>>>> +        ic_route->plen = plen;
>>>>>>> +        ic_route->nexthop = nexthop;
>>>>>>> +        ic_route->nb_route = nb_route;
>>>>>>> +        ic_route->origin = origin;
>>>>>>> +        ic_route->route_table = route_table;
>>>>>>> +        ic_route->nb_lrp = nb_lrp;
>>>>>>> +        hmap_insert(routes_ad, &ic_route->node, hash);
>>>>>>> +    } else {
>>>>>>> +        VLOG_WARN("Duplicate route advertisement was suppressed! NB 
>>>>>>> route "
>>>>>>> +                  "uuid: "UUID_FMT,
>>>>>>> +                  UUID_ARGS(&nb_route->header_.uuid));
>>>>>> 
>>>>>> I think this should be rate limited.
>>>>> 
>>>>> Agree, will fix this in v3.
>>>>> 
>>>>>> 
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void
>>>>>>> +add_static_to_routes_ad(
>>>>>>> +    struct hmap *routes_ad,
>>>>>>> +    const struct nbrec_logical_router_static_route *nb_route,
>>>>>>> +    const struct lport_addresses *nexthop_addresses,
>>>>>>> +    const struct smap *nb_options, const char *route_table)
>>>>>>> {
>>>>>>>   if (strcmp(route_table, nb_route->route_table)) {
>>>>>>>       if (VLOG_IS_DBG_ENABLED()) {
>>>>>>> @@ -1145,16 +1179,8 @@ add_to_routes_ad(struct hmap *routes_ad,
>>>>>>>       ds_destroy(&msg);
>>>>>>>   }
>>>>>>> 
>>>>>>> -    struct ic_route_info *ic_route = xzalloc(sizeof *ic_route);
>>>>>>> -    ic_route->prefix = prefix;
>>>>>>> -    ic_route->plen = plen;
>>>>>>> -    ic_route->nexthop = nexthop;
>>>>>>> -    ic_route->nb_route = nb_route;
>>>>>>> -    ic_route->origin = ROUTE_ORIGIN_STATIC;
>>>>>>> -    ic_route->route_table = nb_route->route_table;
>>>>>>> -    hmap_insert(routes_ad, &ic_route->node,
>>>>>>> -                ic_route_hash(&prefix, plen, &nexthop, 
>>>>>>> ROUTE_ORIGIN_STATIC,
>>>>>>> -                              nb_route->route_table));
>>>>>>> +    add_to_routes_ad(routes_ad, prefix, plen, nexthop, 
>>>>>>> ROUTE_ORIGIN_STATIC,
>>>>>>> +                     nb_route->route_table, NULL, nb_route);
>>>>>>> }
>>>>>>> 
>>>>>>> static void
>>>>>>> @@ -1198,18 +1224,9 @@ add_network_to_routes_ad(struct hmap *routes_ad, 
>>>>>>> const char *network,
>>>>>>>       ds_destroy(&msg);
>>>>>>>   }
>>>>>>> 
>>>>>>> -    struct ic_route_info *ic_route = xzalloc(sizeof *ic_route);
>>>>>>> -    ic_route->prefix = prefix;
>>>>>>> -    ic_route->plen = plen;
>>>>>>> -    ic_route->nexthop = nexthop;
>>>>>>> -    ic_route->nb_lrp = nb_lrp;
>>>>>>> -    ic_route->origin = ROUTE_ORIGIN_CONNECTED;
>>>>>>> -
>>>>>>>   /* directly-connected routes go to <main> route table */
>>>>>>> -    ic_route->route_table = NULL;
>>>>>>> -    hmap_insert(routes_ad, &ic_route->node,
>>>>>>> -                ic_route_hash(&prefix, plen, &nexthop,
>>>>>>> -                              ROUTE_ORIGIN_CONNECTED, ""));
>>>>>>> +    add_to_routes_ad(routes_ad, prefix, plen, nexthop, 
>>>>>>> ROUTE_ORIGIN_CONNECTED,
>>>>>>> +                     NULL, nb_lrp, NULL);
>>>>>>> }
>>>>>>> 
>>>>>>> static bool
>>>>>>> @@ -1369,7 +1386,7 @@ sync_learned_routes(struct ic_context *ctx,
>>>>>>>           struct ic_route_info *route_learned
>>>>>>>               = ic_route_find(&ic_lr->routes_learned, &prefix, plen,
>>>>>>>                               &nexthop, isb_route->origin,
>>>>>>> -                                isb_route->route_table);
>>>>>>> +                                isb_route->route_table, 0);
>>>>>>>           if (route_learned) {
>>>>>>>               /* Sync external-ids */
>>>>>>>               struct uuid ext_id;
>>>>>>> @@ -1468,7 +1485,7 @@ advertise_routes(struct ic_context *ctx,
>>>>>>>       }
>>>>>>>       struct ic_route_info *route_adv =
>>>>>>>           ic_route_find(routes_ad, &prefix, plen, &nexthop,
>>>>>>> -                          isb_route->origin, isb_route->route_table);
>>>>>>> +                          isb_route->origin, isb_route->route_table, 
>>>>>>> 0);
>>>>>>>       if (!route_adv) {
>>>>>>>           /* Delete the extra route from IC-SB. */
>>>>>>>           VLOG_DBG("Delete route %s -> %s from IC-SB, which is not 
>>>>>>> found"
>>>>>>> @@ -1550,8 +1567,8 @@ build_ts_routes_to_adv(struct ic_context *ctx,
>>>>>>>           }
>>>>>>>       } else {
>>>>>>>           /* It may be a route to be advertised */
>>>>>>> -            add_to_routes_ad(routes_ad, nb_route, ts_port_addrs,
>>>>>>> -                             &nb_global->options, ts_route_table);
>>>>>>> +            add_static_to_routes_ad(routes_ad, nb_route, ts_port_addrs,
>>>>>>> +                                    &nb_global->options, 
>>>>>>> ts_route_table);
>>>>>>>       }
>>>>>>>   }
>>>>>>> 
>>>>>>> diff --git a/ovn-ic-sb.ovsschema b/ovn-ic-sb.ovsschema
>>>>>>> index 72c9d3f3e..1d60b36d1 100644
>>>>>>> --- a/ovn-ic-sb.ovsschema
>>>>>>> +++ b/ovn-ic-sb.ovsschema
>>>>>>> @@ -1,7 +1,7 @@
>>>>>>> {
>>>>>>>   "name": "OVN_IC_Southbound",
>>>>>>> -    "version": "1.1.0",
>>>>>>> -    "cksum": "2309827842 6784",
>>>>>>> +    "version": "1.1.1",
>>>>>>> +    "cksum": "3684563024 6914",
>>>>>>>   "tables": {
>>>>>>>       "IC_SB_Global": {
>>>>>>>           "columns": {
>>>>>>> @@ -101,6 +101,8 @@
>>>>>>>               "external_ids": {
>>>>>>>                   "type": {"key": "string", "value": "string",
>>>>>>>                            "min": 0, "max": "unlimited"}}},
>>>>>>> +            "indexes": [["transit_switch", "availability_zone", 
>>>>>>> "route_table",
>>>>>>> +                         "ip_prefix", "nexthop"]],
>>>>>> 
>>>>>> Unfortunately we can't just add an index.  This will break backwards
>>>>>> compatibility.  I'm afraid we just have to deal with duplicates in this
>>>>>> table (doesn't your patch already warn for that?).  Or maybe we should
>>>>>> find a way to transition (recommend an upgrade path?) to a version where
>>>>>> we can enforce the index.
>>>>> 
>>>>> What did you mean by breaking backward compatibility here?
>>>>> I see that in worst case ovn-ic database restart will fail with the 
>>>>> schema convert attempt in case there are records which violate this index.
>>>>> Can we just notice in NEWS file that user must ensure before the upgrade 
>>>>> that there are no multiple records with same availability zone, transit 
>>>>> switch, route table, ip_prefix and nexthop?
>>>>> 
>>>>> I created multiple same routes in ic sb and tried to apply new schema:
>>>>> # ovsdb-client convert unix://var/run/ovn/ovn_ic_sb_db.sock 
>>>>> /usr/share/ovn/ovn-ic-sb.ovsschema
>>>>> 2022-12-09T18:37:36Z|00001|ovsdb|WARN|/usr/share/ovn/ovn-ic-sb.ovsschema: 
>>>>> changed 2 columns in 'OVN_IC_Southbound' database from ephemeral to 
>>>>> persistent, including 'status' column in 'Connection' table, because 
>>>>> clusters do not support ephemeral columns
>>>>> ovsdb-client: transaction returned error: {"details":"Transaction causes 
>>>>> multiple rows in \"Route\" table to have identical values 
>>>>> (\"668552d5-45ce-4e51-b728-118ee9a103b1\", 
>>>>> be9ecbbd-ffbe-4e40-afd9-8c53fee2b39b, \"1\", \"1.1.1.1/32\", and \"\") 
>>>>> for index on columns \"transit_switch\", \"availability_zone\", 
>>>>> \"route_table\", \"ip_prefix\", and \"nexthop\".  First row, with UUID 
>>>>> a4b905bd-3d26-4d78-804e-e51dd54429ea, was inserted by this transaction.  
>>>>> Second row, with UUID 426d4795-8e1b-432f-9cf1-4a0f10818265, was inserted 
>>>>> by this transaction.","error":"constraint violation"}
>>>>> 
>>>> 
>>>> This is what I meant by "breaking backwards compatibility".
>>>> 
>>>>> Alternatively we can not to add this index for now and add it later 
>>>>> (when?).
>>>> 
>>>> This is what I was suggesting as a "way to transition to a version where
>>>> we can enforce the index".  So have ovn-ic clean up and reconcile the
>>>> DBs properly in the next version.  We could potentially also add a note
>>>> to NEWS saying that an index will be enforced later.
>>>> 
>>>>> But this also doesn’t guarantee that end user will upgrade from affected 
>>>>> version through version with the fix in code but without index change and 
>>>>> next upgrade to the version with applied new index.
>>>>> 
>>>> 
>>>> True, but what if we document in the ovn-upgrades.rst manual that ovn-ic
>>>> users should upgrade from versions pre-v22.09 in two stages: first to
>>>> 22.12.latest and then to whatever newer release they need?
>>>> 
>>>>> And the third option I see is just not to add an index at all. But I 
>>>>> don’t like it though. Having an index is a good approach, which prevents 
>>>>> any possible future (or existent in nowadays) bugs, which lead to 
>>>>> creation of duplicated records somehow.
>>>>> I guess it’s not too hard to check the duplicates - convert command 
>>>>> returns uuid of duplicated records, so a slight upgrade difficulties 
>>>>> doesn’t look to me like a reason for not to use this ability (new index).
>>>> 
>>>> I have the impression that this is not that easy.  Doesn't it require to
>>>> stop ovn-ic daemons while running the cleanup script?  Otherwise the
>>>> dups might be reinserted.
>> 
>>>> 
>>>>> What do you think?
>>>>> 
>>>> 
>>>> I think I might be OK with enforcing the index now too if we properly
>>>> document it.  It also depends on how many users we have out there that
>>>> might suffer from these duplicated routes being advertised (I'm guessing
>>>> not too many).  CC-ing more maintainers to see what others think about 
>>>> this.
>>> 
>>> Would it work if we add a new table instead ? "Route_v2" with indexing
>>> and deprecating the older one ?
>>> 
>>> Thanks
>>> Numan
>>> 
>>>> 
>>>> Regards,
>>>> Dumitru
>>>> 
>>>> _______________________________________________
>>>> dev mailing list
>>>> d...@openvswitch.org
>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>> 
>> 
> 
> _______________________________________________
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to