Mmm... nb-db is rolled back, but sb-db is not re-synced; ovn-northd complains "clustered database server has stale data; trying another server". Is there any way to work around it, or do I need to snapshot and roll back sb-db as well?
Thanks!
Tony

From: Tony Liu <tonyliu0...@hotmail.com>
Sent: Thursday, July 30, 2020 7:04 PM
To: Numan Siddique <nusid...@redhat.com>; Han Zhou <hz...@ovn.org>
Cc: Han Zhou <hz...@ovn.org>; ovs-dev <ovs-...@openvswitch.org>; ovs-discuss <ovs-discuss@openvswitch.org>
Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore

Hi,

Just an update: I finally made this snapshot/rollback work. The rollback is not live, though. Here is what I did.

1. Make a snapshot with ovsdb-client, assuming there are no ongoing transactions and the data is consistent on all nodes. The snapshot can be taken on any node. It doesn't include any cluster info. That's probably why the man page says this is for standalone and active-backup only. But that cluster info seems not to be required to restore.
2. To roll back/restore, stop services on all nodes, starting from the followers and ending with the leader.
3. Pick a node as the new leader and copy the snapshot to be the DB file. Then start the service. A cluster with a new cluster ID will be created. The node will be allocated a new server ID as well.
4. On the other two nodes, remove the DB file and restart the service with remote-address pointing to the leader.

Now the new cluster starts working with the rolled-back data.

"ovsdb-client restore" doesn't work for me, not sure why.
====
ovsdb-client: ovsdb error: /dev/stdin: cannot identify file type
====
I tried to restore the snapshot created by backup, and also the directly copied DB file; neither of them works. Has anyone experienced such an issue?

To Numan, it would be great if you could share the details of how to use neutron-ovn-db-sync-util.

Thanks!
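The ordering in steps 2 to 4 above can be sketched as a small planning helper. This is only an illustration in Python, under stated assumptions: the node names, the Step type, and rollback_plan are hypothetical, and the actions are plain descriptions taken from the steps above, not real commands.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    node: str
    action: str


def rollback_plan(followers, old_leader, new_leader):
    """Return the ordered steps of the offline rollback described above.

    All names are hypothetical; actions are descriptions, not commands.
    """
    nodes = list(followers) + [old_leader]
    if new_leader not in nodes:
        raise ValueError("new_leader must be one of the cluster nodes")

    plan = []
    # Step 2: stop services on all nodes, followers first, leader last.
    for node in nodes:
        plan.append(Step(node, "stop service"))
    # Step 3: seed the chosen node with the snapshot, then start it.
    plan.append(Step(new_leader, "copy snapshot to be the DB file"))
    plan.append(Step(new_leader, "start service"))
    # Step 4: the remaining nodes drop their DB file and join the new leader.
    for node in nodes:
        if node != new_leader:
            plan.append(Step(node, "remove DB file"))
            plan.append(Step(node, f"restart service with remote-address {new_leader}"))
    return plan


if __name__ == "__main__":
    for step in rollback_plan(["node2", "node3"], "node1", new_leader="node2"):
        print(f"{step.node}: {step.action}")
```

Keeping the plan as data makes it easy to review the ordering before touching any node; how each action maps to actual service commands depends on the deployment and is not covered here.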
Tony

From: Tony Liu <tonyliu0...@hotmail.com>
Sent: Thursday, July 30, 2020 4:51 PM
To: Numan Siddique <nusid...@redhat.com>; Han Zhou <hz...@ovn.org>
Cc: Han Zhou <hz...@ovn.org>; ovs-dev <ovs-...@openvswitch.org>; ovs-discuss <ovs-discuss@openvswitch.org>
Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore

Hi Numan,

I found this comment you made a few years back.

- At neutron-server startup, the OVN ML2 driver syncs the Neutron DB and the OVN DB if sync mode is set to repair.
- Admin can run "neutron-ovn-db-sync-util" to sync the DBs.

Could you share the details needed to try those two options?

Thanks!
Tony

From: Tony Liu <tonyliu0...@hotmail.com>
Sent: Thursday, July 30, 2020 4:38 PM
To: Han Zhou <hz...@ovn.org>
Cc: Han Zhou <hz...@ovn.org>; ovs-dev <ovs-...@openvswitch.org>; ovs-discuss <ovs-discuss@openvswitch.org>
Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore

Hi,

I have another thought after some digging. Since I am using OpenStack, all networking configuration comes from OpenStack. I could snapshot the OpenStack MariaDB, restore it, and run neutron-ovn-db-sync to update the OVN DB. Would that be a cleaner solution?

BTW, I got this error when restoring the OVN DB.

ovsdb-client: ovsdb error: /dev/stdin: cannot identify file type

The file was created by the "backup" command.

Thanks!
Tony

From: Tony Liu <tonyliu0...@hotmail.com>
Sent: Thursday, July 30, 2020 3:41 PM
To: Han Zhou <hz...@ovn.org>
Cc: Han Zhou <hz...@ovn.org>; ovs-dev <ovs-...@openvswitch.org>; ovs-discuss <ovs-discuss@openvswitch.org>
Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore

Hi,

A quick question here. Given this man page:
http://www.openvswitch.org/support/dist-docs/ovsdb-client.1.txt
It says the backup and restore commands are for OVSDB standalone and active-backup databases. Can they be used for a RAFT cluster? If not, what would be the concern, like inconsistency?
If I restore to a follower, is the request going to be forwarded to the leader to restore the DB for the whole cluster? I believe it's recommended to restore to the leader directly, though, for performance's sake.

I am going to give it a try anyway and see how it works. I will make sure there is no configuration update from the OpenStack side while running such a snapshot and restore process.

Thanks!
Tony

From: Han Zhou <hz...@ovn.org>
Sent: Thursday, July 30, 2020 12:23 PM
To: Tony Liu <tonyliu0...@hotmail.com>
Cc: Han Zhou <hz...@ovn.org>; ovs-discuss <ovs-discuss@openvswitch.org>; ovs-dev <ovs-...@openvswitch.org>
Subject: Re: [ovs-discuss] [OVN] DB backup and restore

On Thu, Jul 30, 2020 at 10:56 AM Tony Liu <tonyliu0...@hotmail.com> wrote:

> Hi Han,
>
> That doc helps. I will run some tests and update here. The use case I want to cover is snapshot/rollback and backup/restore.
>
> ========
> Actually, "at-least-once" consistency, because OVSDB does not have a session mechanism to drop duplicate transactions if a connection drops after the server commits it but before the client receives the result.
> ========
>
> I saw duplicated datapath bindings for the same logical switch once, if you recall. This may explain that. The ovn-northd connection to sb-db is dropped before receiving the result, so ovn-northd initiates another transaction to create a datapath binding for the same logical switch.

Yes, this is a possibility. However, in reality, this is usually not a problem:

1) If the DB schema has table keys properly defined, the redundant transaction from the client would be rejected by the DB server because of the key constraint check. In the datapath binding case, this doesn't work because of the poor definition of the datapath_binding table. It should have had a "logical_switch_router" column defined and set as a key (in addition to "tunnel_key") instead of storing it in external_ids. The duplicated entries would have been avoided.
The other tables, such as port_binding, would never have such a problem.

2) OVSDB clients usually monitor and sync all data of interest from the server to local, so when they do declarative processing, they can correct problems by themselves. In fact, ovn-northd does the check and deletes duplicated datapaths. I did a simple test and it did clean up by itself:

2020-07-30T18:55:53.057Z|00006|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2020-07-30T19:02:10.465Z|00007|ovn_northd|INFO|deleting Datapath_Binding abef9503-445e-4a52-ae88-4c826cbad9d6 with duplicate external-ids:logical-switch/router ee80c38b-2016-4cbc-9437-f73e3a59369e

I am not sure why in your case northd was stuck, but I agree there must be something wrong. Please collect northd logs if you encounter this again so we can dig further.

> I see two ways to improve it.
> 1) On the client side, if the connection is broken while waiting for the result of a transaction, the client checks the transaction state, committed or not, when it reconnects to the leader (maybe a different node). Do we have such a check today?

Clients do check. In this case, when the transaction was actually successful but appears to have failed from the client's point of view, the check doesn't help.

> 2) I see the client connection is dropped by the leader when it's busy. I don't think this is a good way to control the traffic. The server can cache and hold the request when it's busy, or even push back. Dropping the connection is not a good option. Any thoughts here?

The server doesn't make this kind of decision. It could be simply overloaded and disconnected from the cluster, or even worse, a node could crash after committing the transaction.

Thanks,
Han

> Thanks!
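The at-least-once behavior discussed above, and the way a key constraint would reject the redundant transaction, can be illustrated with a toy model. This is a minimal sketch in Python, not OVSDB code: Table, KeyConflict, and create_with_lost_reply are hypothetical names, and it assumes the retried transaction allocates a fresh tunnel_key, which the thread does not state.

```python
class KeyConflict(Exception):
    """Raised when an insert would violate a uniqueness key."""


class Table:
    """Toy in-memory table; each entry of `keys` is a tuple of columns that must be unique."""

    def __init__(self, keys=()):
        self.keys = [tuple(k) for k in keys]
        self.rows = []

    def insert(self, row):
        # Reject the row if any configured key already has the same value.
        for key in self.keys:
            value = tuple(row[col] for col in key)
            if any(tuple(r[col] for col in key) == value for r in self.rows):
                raise KeyConflict(f"{key}={value}")
        self.rows.append(dict(row))


def create_with_lost_reply(table, switch_id):
    """Commit once, lose the reply, then retry the create as a client would."""
    table.insert({"tunnel_key": 1, "logical_switch_router": switch_id})
    # The reply is lost here, so the client believes the transaction failed.
    try:
        # Assumption: the retry allocates a fresh tunnel_key.
        table.insert({"tunnel_key": 2, "logical_switch_router": switch_id})
    except KeyConflict:
        pass  # The server rejects the redundant transaction.
    return len(table.rows)


if __name__ == "__main__":
    only_tunnel_key = Table(keys=[("tunnel_key",)])
    both_keys = Table(keys=[("tunnel_key",), ("logical_switch_router",)])
    print(create_with_lost_reply(only_tunnel_key, "ls-1"))
    print(create_with_lost_reply(both_keys, "ls-1"))
```

With only tunnel_key as a key, the retry leaves two rows for the same logical switch; with the extra key, the second insert is rejected and one row remains.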
> Tony

From: Han Zhou <hz...@ovn.org>
Sent: Wednesday, July 29, 2020 11:38 PM
To: Tony Liu <tonyliu0...@hotmail.com>
Cc: ovs-discuss <ovs-discuss@openvswitch.org>; ovs-dev <ovs-...@openvswitch.org>
Subject: Re: [ovs-discuss] [OVN] DB backup and restore

On Wed, Jul 29, 2020 at 10:58 PM Tony Liu <tonyliu0...@hotmail.com> wrote:
> > Hi,
> >
> > Is there any guidance on how to back up and restore OVN nb-db and sb-db?
> >
> > Is /var/lib/openvswitch/ovn-[ns]b/ovn[ns]b.db the only database file?
> >
> > For a 3-node DB cluster, is the replication factor 3 (the data is replicated onto all 3 nodes)?
> >
> > Are the DB files on the 3 nodes identical?
> >
> > If I stop a DB follower and empty the DB file on the follower node, when I start it back, is the whole DB going to be replicated to it?
> >
> > To back up the DB, is it OK to copy the DB file from any node, assuming no transaction is ongoing?
> >
> > Is the following going to work to restore the DB?
> > * Stop all 3 DBs.
> > * Copy the backup DB file to one node, empty the DB file on the other two nodes.
> > * Bootstrap the node with the DB file.
> > * Start the other two nodes to join the cluster.
>
For ovsdb operations, please refer to "man 7 ovsdb", or here:
https://github.com/openvswitch/ovs/blob/master/Documentation/ref/ovsdb.7.rst
>
> > Do I need to restore sb-db as well? Or restore nb-db only and let ovn-northd sync data from nb-db to sb-db? Should chassis data be updated by ovn-controller?
>
You don't have to restore sb-db. ovn-northd and the ovn-controllers will sync the data in the SB DB. However, it may take quite some time to sync if the scale is large. Also, remember that the mac_binding table in SB will not be restored by ovn-controller, because it is populated as a result of ARP packet handling by ovn-controller. The entries will be generated again only if new ARP packets are observed by ovn-controller.
>
> > I am running a scaling test.
> > It takes quite a lot of time to build configurations. I am wondering if I can back up and restore the DB to roll back to some checkpoint, to avoid starting all over.
> >
> > Thanks!
> >
> > Tony
> >
> > _______________________________________________
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev