On 8/28/25 11:55 PM, Ilya Maximets wrote:
> On 8/28/25 11:41 PM, Ilya Maximets wrote:
>> On 8/13/25 7:20 PM, Mark Michelson via dev wrote:
>>> Prior to this commit, if you wanted to find the corresponding northbound
>>> UUID for a southbound Logical Datapath, you could find it in one of two
>>> places:
>>>
>>> For logical switches, it was in external-ids:logical-switch.
>>> For logical routers, it was in external-ids:logical-router.
>>>
>>> With this commit, we are separating the type and UUID into separate
>>> fields. This way, no matter the type of the datapath, you can find the
>>> UUID. And you can find the type of the datapath without having to
>>> potentially check multiple external-ids keys to do so.
>>>
>>> These fields are going to be used pretty heavily by northd in upcoming
>>> patches, so instead of making them external-ids, these are now
>>> full-fledged columns on the southbound Datapath_Binding. The UUID of the
>>> northbound logical datapath is in a column called "nb_uuid" and the type
>>> of the logical datapath is stored in an enumerated column called "type".
>>>
>>> Note that there is a seemingly-unrelated change in the check packet
>>> length test in tests/ovn.at. This test started failing because its use
>>> of `grep` was picking up the new "nb_uuid" column accidentally. The
>>> change to the test uses `fetch_column` since it is more precise.
>>>
>>> Signed-off-by: Mark Michelson <[email protected]>
>>> Acked-by: Ales Musil <[email protected]>
>>> ---
>>> V17:
>>> - cherry picked from v15
>>> ---
>>>  controller/local_data.c |  2 +-
>>>  ic/ovn-ic.c             |  5 +++--
>>>  lib/ovn-util.c          | 45 +++++++++++++++++++++++++++++++++++++++++
>>>  lib/ovn-util.h          |  8 ++++++++
>>>  northd/northd.c         | 38 +++++++++++++++++-----------------
>>>  northd/northd.h         |  5 +++--
>>>  ovn-sb.ovsschema        | 11 ++++++++--
>>>  ovn-sb.xml              | 21 ++++++++++---------
>>>  tests/ovn-controller.at |  4 ++++
>>>  tests/ovn-northd.at     |  6 +++---
>>>  tests/ovn.at            |  6 ++----
>>>  utilities/ovn-sbctl.c   |  4 ++--
>>>  utilities/ovn-trace.c   |  3 +--
>>>  13 files changed, 112 insertions(+), 46 deletions(-)
>>
>> <snip>
>>
>>>  /* Returns an "enum ovn_stage" built from the arguments.
>>> diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema
>>> index 4c24f5b51..8bc7bbba3 100644
>>> --- a/ovn-sb.ovsschema
>>> +++ b/ovn-sb.ovsschema
>>> @@ -1,7 +1,7 @@
>>>  {
>>>      "name": "OVN_Southbound",
>>> -    "version": "21.2.0",
>>> -    "cksum": "29145795 34859",
>>> +    "version": "21.3.0",
>>> +    "cksum": "3179330761 35286",
>>>      "tables": {
>>>          "SB_Global": {
>>>              "columns": {
>>> @@ -196,6 +196,13 @@
>>>                  "load_balancers": {"type": {"key": {"type": "uuid"},
>>>                                              "min": 0,
>>>                                              "max": "unlimited"}},
>>> +                "type": {"type": {"key": {"type": "string",
>>> +                                          "enum": ["set",
>>> +                                                      ["logical-switch",
>>> +                                                       
>>> "logical-router"]]}}},
>>
>> Hi, Mark, Dumitru, Ales.
>>

Hi Ilya,

>> This change breaks upgrades, at least in active-backup database 
>> configuration.
>>
>> When the old database is converted to the new schema, the default value for
>> the column is an empty string.  But this is against the schema definition.
>> This is not a problem for the database itself, as it doesn't check these
>> constraints, but if we have the backup database connected, it will check the
>> constraints upon receiving the new data and will fail:
>>
>> |replication|INFO|OVN_Southbound: Monitor reply received. Resetting the 
>> database.
>> |00045|ovsdb_error|ERR|unexpected ovsdb error: constraint violation: "" is 
>> not
>>                        one of the allowed values ([logical-router, 
>> logical-switch])
>>
>> At this point the backup server will disconnect and turn into active server
>> as ti thinks that the origin is broken.  We probably need to allow this
>> column to not have a value specified.
>>
>> It's fairly easy to reporoduce - get an older database and start a sandbox
>> with it.  The sb2 will error out and disconnect.
>>

Nice catch!  I managed to reproduce the problem as following your
suggestion locally: I started a 25.03 sandbox, ran ./ovn-setup.sh, saved
the NB/SB and then started a sandbox on main with the saved DBs.

>> Also, northd and ovn-controller should be able to handle this case where
>> the column is not yet set after upgrade.  I didn't check if that works or
>> not.

Yes, that part is handled correctly, see datapath_get_nb_uuid_and_type().

Are you planning to post a patch to fix this?  I assume we just need to
make the column value optional in the SB schema.

> 
> And it seems like northd is in the infinite lop of re-writing entire sb.
> Not sure if it is related or some other issue.
> 

Could it be that you didn't have the Static_Mac_Binding fix in your
local main branch?  I don't see the infinite loop in my reproducer.

https://github.com/ovn-org/ovn/commit/e8a0ade

If not, it might be good to share the DBs if possible.

>>
>> Best regards, Ilya Maximets.
> 

Regards,
Dumitru

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to