On Thu, Jul 30, 2020 at 11:07 PM Tony Liu <tonyliu0...@hotmail.com> wrote:
> Hi Han,
>
> ovsdb-client backup and restore work as expected. Sorry for the false
> alarm. I messed up with the container. When I restore the snapshot for
> nb-db, sb-db is updated accordingly by ovn-northd.
>

Great!

> I think this man page should be updated to say that RAFT clusters are
> also supported by backup and restore.
> http://www.openvswitch.org/support/dist-docs/ovsdb-client.1.txt
>

I didn't see it saying RAFT cluster is not supported in the above document.
Probably you misunderstood this statement:

"Reads snapshot, which must be a OVSDB standalone or active-backup database"

The backup file you generated with the ovsdb-client backup command is in
OVSDB standalone format, which is mentioned in the "backup" section of that
document. In addition, ovsdb(7) also makes it clear that this is the right
way to back up and restore a clustered DB. Did this clarify?

> Thanks!
>
> Tony
>
> > -----Original Message-----
> > From: Han Zhou <hz...@ovn.org>
> > Sent: Thursday, July 30, 2020 7:19 PM
> > To: Tony Liu <tonyliu0...@hotmail.com>
> > Cc: Numan Siddique <nusid...@redhat.com>; Han Zhou <hz...@ovn.org>;
> > ovs-dev <ovs-dev@openvswitch.org>; ovs-discuss <ovs-disc...@openvswitch.org>
> > Subject: Re: [ovs-discuss] [OVN] DB backup and restore
> >
> > On Thu, Jul 30, 2020 at 7:04 PM Tony Liu <tonyliu0...@hotmail.com> wrote:
> >
> > Hi,
> >
> > Just an update: I finally made this snapshot/rollback work for me.
> > The rollback is not live, though. Here is what I did.
> >
> > 1. Make a snapshot with ovsdb-client, assuming no ongoing
> >    transactions and that data is consistent on all nodes. The
> >    snapshot can be taken on any node. It doesn't include any
> >    cluster info. That's probably why the man page says this is
> >    for standalone and A/B only. But that cluster info seems
> >    not to be required for a restore.
> >
> > 2. To roll back/restore, stop services on all nodes, starting
> >    from the followers and ending with the leader.
> >
> > 3. Pick a node as the new leader and copy the snapshot to be the DB
> >    file. Then start the service. A cluster with a new cluster ID
> >    will be created. The node will be allocated a new server ID
> >    as well.
> >
> > 4. On the other two nodes, remove the DB file and restart the service
> >    with remote-address pointing to the leader.
> >
> > Now the new cluster starts working with the rollback data.
> >
> > The steps you gave may work, but it is weird. It is better to just
> > follow the steps mentioned in this section:
> > https://github.com/openvswitch/ovs/blob/master/Documentation/ref/ovsdb.7.rst#backing-up-and-restoring-a-database
> >
> > "ovsdb-client restore" doesn't work for me, not sure why.
> > ====
> > ovsdb-client: ovsdb error: /dev/stdin: cannot identify file type
> > ====
> > I tried to restore the snapshot created by backup, and also the
> > directly copied DB file; neither of them works. Has anyone
> > experienced such an issue?
> >
> > Maybe your command was wrong. Could you share your command line, and the
> > version used?
> >
> > To Numan, it would be great if you could share the details of how to use
> > neutron-ovn-db-sync-util.
> >
> > Thanks!
> >
> > Tony
> >
> > From: Tony Liu <tonyliu0...@hotmail.com>
> > Sent: Thursday, July 30, 2020 4:51 PM
> > To: Numan Siddique <nusid...@redhat.com>; Han Zhou <hz...@ovn.org>
> > Cc: Han Zhou <hz...@ovn.org>; ovs-dev <ovs-d...@openvswitch.org>;
> > ovs-discuss <ovs-disc...@openvswitch.org>
> > Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore
> >
> > Hi Numan,
> >
> > I found this comment you made a few years back.
> >
> > - At neutron-server startup, the OVN ML2 driver syncs the neutron
> >   DB and OVN DB if sync mode is set to repair.
> > - Admin can run the "neutron-ovn-db-sync-util" to sync the DBs.
> >
> > Could you share the details to try those two options?
> >
> > Thanks!
> >
> > Tony
> >
> > From: Tony Liu <tonyliu0...@hotmail.com>
> > Sent: Thursday, July 30, 2020 4:38 PM
> > To: Han Zhou <hz...@ovn.org>
> > Cc: Han Zhou <hz...@ovn.org>; ovs-dev <ovs-d...@openvswitch.org>;
> > ovs-discuss <ovs-disc...@openvswitch.org>
> > Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore
> >
> > Hi,
> >
> > I had another thought after some digging. Since I am with
> > OpenStack, all networking configuration comes from OpenStack.
> > I could snapshot the OpenStack MariaDB, restore it, and run
> > neutron-ovn-db-sync to update the OVN DB. Would that be a cleaner
> > solution?
> >
> > BTW, I got this error when restoring the OVN DB.
> > ovsdb-client: ovsdb error: /dev/stdin: cannot identify file type
> >
> > The file was created by the "backup" command.
> >
> > Thanks!
> >
> > Tony
> >
> > From: Tony Liu <tonyliu0...@hotmail.com>
> > Sent: Thursday, July 30, 2020 3:41 PM
> > To: Han Zhou <hz...@ovn.org>
> > Cc: Han Zhou <hz...@ovn.org>; ovs-dev <ovs-d...@openvswitch.org>;
> > ovs-discuss <ovs-disc...@openvswitch.org>
> > Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore
> >
> > Hi,
> >
> > A quick question here. Given this man page:
> > http://www.openvswitch.org/support/dist-docs/ovsdb-client.1.txt
> >
> > It says the backup and restore commands are for OVSDB standalone and
> > active-backup databases.
> >
> > Can they be used for a RAFT cluster? If not, what would be the
> > concern, like inconsistency?
> >
> > If I restore to a follower, is the request going to be forwarded to
> > the leader to restore DB for the whole cluster?
> > But I believe it's recommended to restore to the leader directly for
> > performance's sake.
> >
> > I am going to give it a try anyway and see how it works. I will make
> > sure there is no configuration update from the OpenStack side while
> > running the snapshot and restore process.
> >
> > Thanks!
> >
> > Tony
> >
> > From: Han Zhou <hz...@ovn.org>
> > Sent: Thursday, July 30, 2020 12:23 PM
> > To: Tony Liu <tonyliu0...@hotmail.com>
> > Cc: Han Zhou <hz...@ovn.org>; ovs-discuss <ovs-disc...@openvswitch.org>;
> > ovs-dev <ovs-dev@openvswitch.org>
> > Subject: Re: [ovs-discuss] [OVN] DB backup and restore
> >
> > On Thu, Jul 30, 2020 at 10:56 AM Tony Liu <tonyliu0...@hotmail.com> wrote:
> >
> > Hi Han,
> >
> > That doc helps. I will run some tests and update here. The use cases
> > I want to cover are snapshot/rollback and backup/restore.
> >
> > ========
> > Actually, "at-least-once" consistency, because OVSDB does not have
> > a session mechanism to drop duplicate transactions if a connection
> > drops after the server commits it but before the client receives
> > the result.
> > ========
> > I saw duplicated datapath bindings for the same logical switch once,
> > if you recall. This may explain that. The ovn-northd connection to
> > sb-db is dropped before receiving the result, so ovn-northd initiates
> > another transaction to create a datapath binding for the same logical
> > switch.
> >
> > Yes, this is a possibility.
> > However, in reality, this is usually not a problem:
> >
> > 1) If the DB schema has table keys properly defined, the redundant
> > transaction from clients would be rejected by the DB server because of
> > the key constraint check. In the datapath binding case, this doesn't
> > work because of the poor definition of the datapath_binding table.
> > It should have had a "logical_switch_router" column defined and set as
> > a key (in addition to the "tunnel_key") instead of storing it in
> > external_ids. The duplicated entries would then have been avoided.
> > Other tables such as port_binding would never have such a problem.
> >
> > 2) OVSDB clients usually monitor and sync all data of interest from the
> > server to a local copy, so when they do declarative processing, they can
> > correct problems by themselves. In fact, ovn-northd does the check and
> > deletes duplicated datapaths. I did a simple test and it did clean up
> > by itself:
> > 2020-07-30T18:55:53.057Z|00006|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
> > 2020-07-30T19:02:10.465Z|00007|ovn_northd|INFO|deleting Datapath_Binding abef9503-445e-4a52-ae88-4c826cbad9d6 with duplicate external-ids:logical-switch/router ee80c38b-2016-4cbc-9437-f73e3a59369e
> >
> > I am not sure why in your case northd was stuck, but I agree there
> > must be something wrong. Please collect northd logs if you encounter
> > this again so we can dig further.
> >
> > I see two ways to improve it.
> > 1) On the client side, if the connection is broken while waiting for
> > the result of a transaction, the client checks the transaction state,
> > committed or not, when it reconnects to the leader (maybe a different
> > node). Do we have such a check today?
> >
> > Clients do check. In this case, when the transaction was actually
> > successful but appears to have failed from the client's point of view,
> > the check doesn't help.
> >
> > 2) I see the client connection is dropped by the leader when it's busy.
> > I don't think this is a good way to control the traffic. The server
> > can cache and hold the request when it's busy, or even push back.
> > Dropping the connection is not a good option. Any thoughts here?
> >
> > The server doesn't make this kind of decision.
> > It could simply be overloaded and disconnected from the cluster, or
> > even worse, a node could crash after committing the transaction.
> >
> > Thanks,
> > Han
> >
> > Thanks!
> >
> > Tony
> >
> > From: Han Zhou <hz...@ovn.org>
> > Sent: Wednesday, July 29, 2020 11:38 PM
> > To: Tony Liu <tonyliu0...@hotmail.com>
> > Cc: ovs-discuss <ovs-disc...@openvswitch.org>; ovs-dev <ovs-dev@openvswitch.org>
> > Subject: Re: [ovs-discuss] [OVN] DB backup and restore
> >
> > On Wed, Jul 29, 2020 at 10:58 PM Tony Liu <tonyliu0...@hotmail.com> wrote:
> > >
> > > Hi,
> > >
> > > Is there any guidance on backing up and restoring OVN nb-db and sb-db?
> > >
> > > Is /var/lib/openvswitch/ovn-[ns]b/ovn[ns]b.db the only database file?
> > >
> > > For a 3-node DB cluster, is replication 3 (the data is replicated
> > > onto all 3 nodes)?
> > >
> > > Are the DB files on the 3 nodes identical?
> > >
> > > If I stop a DB follower and empty the DB file on the follower node,
> > > when I start it back, is the whole DB going to be replicated to it?
> > >
> > > To back up the DB, is it OK to copy the DB file from any node,
> > > assuming no transaction is ongoing?
> > >
> > > Is the following going to work to restore the DB?
> > > * Stop all 3 DBs.
> > > * Copy the backup DB file to one node, empty the DB file on the other
> > >   two nodes.
> > > * Bootstrap the node with the DB file.
> > > * Start the other two nodes to join the cluster.
> > >
> >
> > For ovsdb operations, please refer to "man 7 ovsdb", or here:
> > https://github.com/openvswitch/ovs/blob/master/Documentation/ref/ovsdb.7.rst
> >
> > >
> > > Do I need to restore sb-db as well? Or restore nb-db only and let
> > > ovn-northd sync data from nb-db to sb-db? Chassis data should be
> > > updated by ovn-controller?
> > >
> >
> > You don't have to restore sb-db. ovn-northd and ovn-controllers
> > will sync the data in SB DB.
> > However, it may take quite some time to sync if the scale is large.
> > Also, remember that the mac_binding table in SB will not be
> > restored by ovn-controller because it is populated as a result of ARP
> > packet handling by ovn-controller. The entries will be generated again
> > only if new ARP packets are observed by ovn-controller.
> >
> > >
> > > I am running a scaling test. It takes quite a lot of time to build
> > > the configuration. I am wondering if I can back up and restore the
> > > DB to roll back to some checkpoint to avoid starting all over.
> > >
> > > Thanks!
> > >
> > > Tony
> > >
> > > _______________________________________________
> > > discuss mailing list
> > > disc...@openvswitch.org
> > > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> >
> > _______________________________________________
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
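
For readers of the archive, the backup and restore flow discussed in this thread can be sketched roughly as below. This is only a sketch under stated assumptions: the socket path, the database name and the snapshot file name are not taken from the thread and will differ between installations, and the helper names nb_backup and nb_restore are made up for illustration. The "backup" and "restore" subcommands are the ones described in ovsdb-client(1); backup writes the snapshot to stdout and restore reads it from stdin. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only. The values below are assumptions, not taken from the thread;
# adjust them to match the local installation.
NB_REMOTE="${NB_REMOTE:-unix:/var/run/ovn/ovnnb_db.sock}"  # assumed socket path
NB_DB="${NB_DB:-OVN_Northbound}"                           # assumed database name
SNAPSHOT="${SNAPSHOT:-ovnnb-snapshot.db}"                  # hypothetical file name
DRY_RUN="${DRY_RUN:-1}"                                    # 1 = only print commands

# Print the backup command, or run it when DRY_RUN=0.
# "ovsdb-client backup" writes a standalone-format snapshot to stdout.
nb_backup() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "ovsdb-client backup $NB_REMOTE $NB_DB > $SNAPSHOT"
    else
        ovsdb-client backup "$NB_REMOTE" "$NB_DB" > "$SNAPSHOT"
    fi
}

# Print the restore command, or run it when DRY_RUN=0.
# "ovsdb-client restore" reads the snapshot from stdin.
nb_restore() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "ovsdb-client restore $NB_REMOTE $NB_DB < $SNAPSHOT"
    else
        ovsdb-client restore "$NB_REMOTE" "$NB_DB" < "$SNAPSHOT"
    fi
}

nb_backup
nb_restore
```

As noted in the thread, restoring nb-db alone is usually enough, since ovn-northd and the ovn-controllers repopulate sb-db, apart from the mac_binding entries, which only reappear when new ARP packets are seen.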