Re: [ovs-discuss] [ovs-dev] OVN: Two datapath-bindings are created for the same logical-switch

2020-07-30 Thread Han Zhou
Resending as plain text, since I got a "The message's content type was not
explicitly allowed" reply from ovs-dev-owner.

On Thu, Jul 30, 2020 at 7:30 PM Han Zhou  wrote:
>
>
>
> On Thu, Jul 30, 2020 at 7:24 PM Tony Liu  wrote:
>>
>> Hi Han,
>>
>> Continuing with this thread, regarding your comment in another thread:
>>
>> ===
>> 2) OVSDB clients usually monitor and sync all (interested) data from the
>> server to local, so when they do declarative processing, they can correct
>> problems by themselves. In fact, ovn-northd does the check and deletes
>> duplicated datapaths. I did a simple test and it did clean up by itself:
>>
>> 2020-07-30T18:55:53.057Z|6|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
>> 2020-07-30T19:02:10.465Z|7|ovn_northd|INFO|deleting Datapath_Binding abef9503-445e-4a52-ae88-4c826cbad9d6 with duplicate external-ids:logical-switch/router ee80c38b-2016-4cbc-9437-f73e3a59369e
>>
>> I am not sure why northd was stuck in your case, but I agree there must
>> be something wrong. Please collect northd logs if you encounter this again
>> so we can dig further.
>> ===
>>
>> You are right that ovn-northd will try to clean up the duplication, but
>> there are ports in port-binding referencing this datapath-binding, so
>> ovn-northd fails to delete the datapath-binding. I have to manually delete
>> those ports to be able to delete the datapath-binding. I believe it's not
>> supported for ovn-northd to delete a configuration that is being
>> referenced. Is that right? If yes, should we fix it, or is it intentional?
>>
>>
>
>
> Yes, good point!
> It is definitely a bug and we should fix it. I think the best fix is to
> change the schema and add "logical_datapath" as an index, but we'll need to
> make it backward compatible to avoid upgrade issues.
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovs-dev] OVN: Two datapath-bindings are created for the same logical-switch

2020-07-30 Thread Tony Liu
I agree, that will stop the duplication from being created.


Thanks!

Tony

From: Han Zhou
Sent: Thursday, July 30, 2020 7:30 PM
To: Tony Liu
Cc: Ben Pfaff; ovs-dev; 
ovs-discuss@openvswitch.org
Subject: Re: [ovs-dev] OVN: Two datapath-bindings are created for the same 
logical-switch



On Thu, Jul 30, 2020 at 7:24 PM Tony Liu wrote:
Hi Han,

Continuing with this thread, regarding your comment in another thread:
===
2) OVSDB clients usually monitor and sync all (interested) data from the server
to local, so when they do declarative processing, they can correct problems
by themselves. In fact, ovn-northd does the check and deletes duplicated
datapaths. I did a simple test and it did clean up by itself:
2020-07-30T18:55:53.057Z|6|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2020-07-30T19:02:10.465Z|7|ovn_northd|INFO|deleting Datapath_Binding abef9503-445e-4a52-ae88-4c826cbad9d6 with duplicate external-ids:logical-switch/router ee80c38b-2016-4cbc-9437-f73e3a59369e

I am not sure why northd was stuck in your case, but I agree there must be
something wrong. Please collect northd logs if you encounter this again so we
can dig further.
===

You are right that ovn-northd will try to clean up the duplication, but
there are ports in port-binding referencing this datapath-binding, so
ovn-northd fails to delete the datapath-binding. I have to manually delete
those ports to be able to delete the datapath-binding. I believe it's not
supported for ovn-northd to delete a configuration that is being
referenced. Is that right? If yes, should we fix it, or is it intentional?


Yes, good point!
It is definitely a bug and we should fix it. I think the best fix is to change
the schema and add "logical_datapath" as an index, but we'll need to make it
backward compatible to avoid upgrade issues.


Thanks!

Tony

From: Tony Liu
Sent: Thursday, July 23, 2020 7:51 PM
To: Han Zhou; Ben Pfaff
Cc: ovs-dev; 
ovs-discuss@openvswitch.org
Subject: Re: [ovs-discuss] [ovs-dev] OVN: Two datapath-bindings are created for 
the same logical-switch

Hi Han,

Thanks for taking the time to look into this. This problem is not consistently
reproduced.
Developers normally ignore it :) I think we have collected enough context and we
can let it go for now.
I will rebuild the setup, tune that RAFT heartbeat timer and rerun the test. I will
keep you posted.


Thanks again!

Tony


From: Han Zhou
Sent: July 23, 2020 06:53 PM
To: Tony Liu; Ben Pfaff
Cc: Numan Siddique; ovs-dev; ovs-discuss@openvswitch.org
Subject: Re: [ovs-dev] OVN: Two datapath-bindings are created for the same 
logical-switch


On Thu, Jul 23, 2020 at 10:33 AM Tony Liu wrote:
>
> Changed the title for this specific problem.
> I looked into logs and have more findings.
> The problem was happening when sb-db leader switched.

Hi Tony,

Thanks for this detailed information. Could you confirm which version of OVS is
used (to understand the OVSDB behavior)?

>
> For an ovsdb cluster, what may trigger the leader switch? Given the log,
> 2020-07-21T01:08:38.119Z|00074|raft|INFO|term 2: 1135 ms timeout expired, starting election
> The election is requested by a follower node. Is that because the connection
> from the follower to the leader timed out, so the follower assumes the leader
> is dead and starts an election?

You are right, the RAFT heartbeat would time out when the server is too busy and
the election timer is too small (default 1s). For large-scale tests, please
increase the election timer with:
ovn-appctl -t  cluster/change-election-timer OVN_Southbound 

I suggest to set  to be at least bigger than 1 or more in your case. 
(you need to increase the value gradually - 2000, 4000, 8000, 16000 - so it 
will take you 4 commands to reach this from the initial default value 1000, not 
very convenient, I know :)

The -t argument here is the path to the socket ctl file of ovn-sb, usually under
/var/run/ovn.
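
The gradual increase described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up, the 16000 ms target is taken from the sequence quoted in the message, the rule that each step may at most double the current value is assumed from that sequence, and "CTL_FILE" is a stand-in for the real control socket path. The script only prints the commands; it does not run them.

```python
def election_timer_steps(current_ms, target_ms):
    """Return the values to set, in order, at most doubling per step."""
    if current_ms <= 0 or target_ms <= 0:
        raise ValueError("timer values must be positive")
    steps = []
    while current_ms < target_ms:
        # Assumption: one command may at most double the current timer.
        current_ms = min(current_ms * 2, target_ms)
        steps.append(current_ms)
    return steps


def render_commands(ctl_path, db_name, steps):
    """Format one ovn-appctl invocation per step; nothing is executed."""
    return [
        f"ovn-appctl -t {ctl_path} cluster/change-election-timer {db_name} {ms}"
        for ms in steps
    ]


if __name__ == "__main__":
    # CTL_FILE is a placeholder for the ovn-sb control socket path.
    for cmd in render_commands("CTL_FILE", "OVN_Southbound",
                               election_timer_steps(1000, 16000)):
        print(cmd)
```

Starting from the default of 1000 this yields four commands, matching the count mentioned in the message.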

>
> For ovn-northd (3 instances), they all connect to the sb-db leader; whoever
> has the lock is the master.
> When the sb-db leader switches, all ovn-northd instances look for the new
> leader. In this case, there is no guarantee that the old ovn-northd master
> keeps the role; another ovn-northd instance may find the leader and acquire
> the lock first. So, the sb-db leader switch may also cause an ovn-northd
> master switch.
> Such switch may happen in the middle of 

Re: [ovs-discuss] [ovs-dev] OVN: Two datapath-bindings are created for the same logical-switch

2020-07-30 Thread Han Zhou
On Thu, Jul 30, 2020 at 7:24 PM Tony Liu  wrote:

> Hi Han,
>
>
>
> Continuing with this thread, regarding your comment in another thread:
>
> ===
> 2) OVSDB clients usually monitor and sync all (interested) data from the
> server to local, so when they do declarative processing, they can correct
> problems by themselves. In fact, ovn-northd does the check and deletes
> duplicated datapaths. I did a simple test and it did clean up by itself:
>
> 2020-07-30T18:55:53.057Z|6|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
> 2020-07-30T19:02:10.465Z|7|ovn_northd|INFO|deleting Datapath_Binding abef9503-445e-4a52-ae88-4c826cbad9d6 with duplicate external-ids:logical-switch/router ee80c38b-2016-4cbc-9437-f73e3a59369e
>
> I am not sure why northd was stuck in your case, but I agree there must be
> something wrong. Please collect northd logs if you encounter this again so
> we can dig further.
> ===
>
> You are right that ovn-northd will try to clean up the duplication, but
> there are ports in port-binding referencing this datapath-binding, so
> ovn-northd fails to delete the datapath-binding. I have to manually delete
> those ports to be able to delete the datapath-binding. I believe it's not
> supported for ovn-northd to delete a configuration that is being
> referenced. Is that right? If yes, should we fix it, or is it intentional?
>
>
>

Yes, good point!
It is definitely a bug and we should fix it. I think the best fix is to
change the schema and add "logical_datapath" as an index, but we'll need to
make it backward compatible to avoid upgrade issues.
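
To illustrate the effect of such an index, here is a minimal sketch using SQLite as a stand-in for OVSDB. The table layout is a toy and not the real southbound schema; the column name logical_datapath is simply taken from the suggestion above, and the backward-compatibility concern is not modeled.

```python
import sqlite3


def make_table(unique_logical_datapath):
    """Create an in-memory toy table; tunnel_key is always unique."""
    conn = sqlite3.connect(":memory:")
    ld_constraint = "UNIQUE" if unique_logical_datapath else ""
    conn.execute(
        "CREATE TABLE datapath_binding ("
        "tunnel_key INTEGER UNIQUE, "
        f"logical_datapath TEXT {ld_constraint})"
    )
    return conn


def try_insert(conn, tunnel_key, logical_datapath):
    """Return True if the row was inserted, False if an index rejected it."""
    try:
        conn.execute(
            "INSERT INTO datapath_binding VALUES (?, ?)",
            (tunnel_key, logical_datapath),
        )
        return True
    except sqlite3.IntegrityError:
        return False


LS = "ee80c38b-2016-4cbc-9437-f73e3a59369e"

# Only tunnel_key indexed: a second row with a different key is accepted.
old = make_table(unique_logical_datapath=False)
old_results = (try_insert(old, 1, LS), try_insert(old, 2, LS))
print(old_results)

# With the extra index, the duplicate for the same logical switch is rejected.
new = make_table(unique_logical_datapath=True)
new_results = (try_insert(new, 1, LS), try_insert(new, 2, LS))
print(new_results)
```

With the extra uniqueness constraint the second insert fails instead of leaving a duplicate behind for ovn-northd to clean up later.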


>
>
> Thanks!
>
>
>
> Tony
>
>
>
> *From: *Tony Liu
> *Sent: *Thursday, July 23, 2020 7:51 PM
> *To: *Han Zhou; Ben Pfaff
> *Cc: *ovs-dev; ovs-discuss@openvswitch.org
> *Subject: *Re: [ovs-discuss] [ovs-dev] OVN: Two datapath-bindings are
> created for the same logical-switch
>
>
>
> Hi Han,
>
>
>
> Thanks for taking the time to look into this. This problem is not
> consistently reproduced.
>
> Developers normally ignore it :) I think we have collected enough context and
> we can let it go for now.
>
> I will rebuild the setup, tune that RAFT heartbeat timer and rerun the test.
> I will keep you posted.
>
>
>
>
>
> Thanks again!
>
>
>
> Tony
>
>
>
> *From:* Han Zhou
> *Sent:* July 23, 2020 06:53 PM
> *To:* Tony Liu; Ben Pfaff
> *Cc:* Numan Siddique; ovs-dev; ovs-discuss@openvswitch.org
> *Subject:* Re: [ovs-dev] OVN: Two datapath-bindings are created for the
> same logical-switch
>
>
>
>
> On Thu, Jul 23, 2020 at 10:33 AM Tony Liu  wrote:
> >
> > Changed the title for this specific problem.
> > I looked into logs and have more findings.
> > The problem was happening when sb-db leader switched.
>
>
>
> Hi Tony,
>
>
>
> Thanks for this detailed information. Could you confirm which version of
> OVS is used (to understand the OVSDB behavior)?
>
>
>
> >
> > For an ovsdb cluster, what may trigger the leader switch? Given the log,
> > 2020-07-21T01:08:38.119Z|00074|raft|INFO|term 2: 1135 ms timeout expired, starting election
> > The election is requested by a follower node. Is that because the connection
> > from the follower to the leader timed out, so the follower assumes the leader
> > is dead and starts an election?
>
>
>
> You are right, the RAFT heartbeat would time out when the server is too busy
> and the election timer is too small (default 1s). For large-scale tests,
> please increase the election timer with:
>
> ovn-appctl -t  cluster/change-election-timer OVN_Southbound 
>
>
>
> I suggest to set  to be at least bigger than 1 or more in your
> case. (you need to increase the value gradually - 2000, 4000, 8000, 16000 -
> so it will take you 4 commands to reach this from the initial default value
> 1000, not very convenient, I know :)
>
>
>
> The -t argument here is the path to the socket ctl file of ovn-sb, usually
> under /var/run/ovn.
>
>
>
> >
>
> > For ovn-northd (3 instances), they all connect to the sb-db leader; whoever
> > has the lock is the master.
> > When the sb-db leader switches, all ovn-northd instances look for the new
> > leader. In this case, there is no guarantee that the old ovn-northd master
> > keeps the role; another ovn-northd instance may find the leader and acquire
> > the lock first. So, the sb-db leader switch may also cause an ovn-northd
> > master switch.
> > Such a switch may happen in the middle of an ovn-northd transaction; in that
> > case, is there any guarantee of transaction completeness? My guess is that
> > the old master created a datapath-binding for a logical-switch, the switch
> > happened before this transaction completed, and then the new master/leader
> > created another datapath-binding for the same logical-switch. Does that make
> > sense?
>
>
>
> I agree with you it could be related to the failover and the lock behavior
> during the failover. It could be a lock problem causing 2 northds to become
> active at the same time for a short moment. However, I still can't 

Re: [ovs-discuss] [ovs-dev] OVN: Two datapath-bindings are created for the same logical-switch

2020-07-30 Thread Tony Liu
Hi Han,

Continuing with this thread, regarding your comment in another thread:
===
2) OVSDB clients usually monitor and sync all (interested) data from the server
to local, so when they do declarative processing, they can correct problems
by themselves. In fact, ovn-northd does the check and deletes duplicated
datapaths. I did a simple test and it did clean up by itself:
2020-07-30T18:55:53.057Z|6|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2020-07-30T19:02:10.465Z|7|ovn_northd|INFO|deleting Datapath_Binding abef9503-445e-4a52-ae88-4c826cbad9d6 with duplicate external-ids:logical-switch/router ee80c38b-2016-4cbc-9437-f73e3a59369e

I am not sure why northd was stuck in your case, but I agree there must be
something wrong. Please collect northd logs if you encounter this again so we
can dig further.
===

You are right that ovn-northd will try to clean up the duplication, but
there are ports in port-binding referencing this datapath-binding, so
ovn-northd fails to delete the datapath-binding. I have to manually delete
those ports to be able to delete the datapath-binding. I believe it's not
supported for ovn-northd to delete a configuration that is being
referenced. Is that right? If yes, should we fix it, or is it intentional?
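
The failure mode described here can be modeled with a small Python sketch. It is a toy, not ovn-northd or OVSDB code: the dictionaries, UUID-like names and the exception class are all hypothetical, and it only shows why a delete of a still-referenced row is refused and why removing the referencing port bindings first lets it through.

```python
class ReferentialIntegrityError(Exception):
    """Raised when a row that is still referenced would be deleted."""


def delete_datapath(datapaths, port_bindings, dp_uuid):
    """Delete dp_uuid unless some port binding still points at it."""
    referencing = sorted(p for p, dp in port_bindings.items() if dp == dp_uuid)
    if referencing:
        raise ReferentialIntegrityError(
            f"{dp_uuid} is still referenced by {referencing}")
    datapaths.remove(dp_uuid)


def delete_datapath_and_ports(datapaths, port_bindings, dp_uuid):
    """Drop the referencing port bindings first, then the datapath."""
    for port in [p for p, dp in port_bindings.items() if dp == dp_uuid]:
        del port_bindings[port]
    delete_datapath(datapaths, port_bindings, dp_uuid)


# Hypothetical state: one duplicate datapath that a port still references.
datapaths = {"dp-keep", "dp-dup"}
port_bindings = {"port-1": "dp-keep", "port-2": "dp-dup"}

try:
    delete_datapath(datapaths, port_bindings, "dp-dup")
except ReferentialIntegrityError as err:
    print("delete refused:", err)

delete_datapath_and_ports(datapaths, port_bindings, "dp-dup")
print(sorted(datapaths), port_bindings)
```

The second helper mirrors the manual workaround of deleting the ports before the datapath-binding; whether ovn-northd should do that automatically is exactly the open question.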


Thanks!

Tony

From: Tony Liu
Sent: Thursday, July 23, 2020 7:51 PM
To: Han Zhou; Ben Pfaff
Cc: ovs-dev; 
ovs-discuss@openvswitch.org
Subject: Re: [ovs-discuss] [ovs-dev] OVN: Two datapath-bindings are created for 
the same logical-switch

Hi Han,

Thanks for taking the time to look into this. This problem is not consistently
reproduced.
Developers normally ignore it :) I think we have collected enough context and we
can let it go for now.
I will rebuild the setup, tune that RAFT heartbeat timer and rerun the test. I will
keep you posted.


Thanks again!

Tony


From: Han Zhou
Sent: July 23, 2020 06:53 PM
To: Tony Liu; Ben Pfaff
Cc: Numan Siddique; ovs-dev; ovs-discuss@openvswitch.org
Subject: Re: [ovs-dev] OVN: Two datapath-bindings are created for the same 
logical-switch


On Thu, Jul 23, 2020 at 10:33 AM Tony Liu wrote:
>
> Changed the title for this specific problem.
> I looked into logs and have more findings.
> The problem was happening when sb-db leader switched.

Hi Tony,

Thanks for this detailed information. Could you confirm which version of OVS is
used (to understand the OVSDB behavior)?

>
> For an ovsdb cluster, what may trigger the leader switch? Given the log,
> 2020-07-21T01:08:38.119Z|00074|raft|INFO|term 2: 1135 ms timeout expired, starting election
> The election is requested by a follower node. Is that because the connection
> from the follower to the leader timed out, so the follower assumes the leader
> is dead and starts an election?

You are right, the RAFT heartbeat would time out when the server is too busy and
the election timer is too small (default 1s). For large-scale tests, please
increase the election timer with:
ovn-appctl -t  cluster/change-election-timer OVN_Southbound 

I suggest to set  to be at least bigger than 1 or more in your case. 
(you need to increase the value gradually - 2000, 4000, 8000, 16000 - so it 
will take you 4 commands to reach this from the initial default value 1000, not 
very convenient, I know :)

The -t argument here is the path to the socket ctl file of ovn-sb, usually under
/var/run/ovn.

>
> For ovn-northd (3 instances), they all connect to the sb-db leader; whoever
> has the lock is the master.
> When the sb-db leader switches, all ovn-northd instances look for the new
> leader. In this case, there is no guarantee that the old ovn-northd master
> keeps the role; another ovn-northd instance may find the leader and acquire
> the lock first. So, the sb-db leader switch may also cause an ovn-northd
> master switch.
> Such a switch may happen in the middle of an ovn-northd transaction; in that
> case, is there any guarantee of transaction completeness? My guess is that
> the old master created a datapath-binding for a logical-switch, the switch
> happened before this transaction completed, and then the new master/leader
> created another datapath-binding for the same logical-switch. Does that make
> sense?

I agree with you it could be related to the failover and the lock behavior
during the failover. It could be a lock problem causing 2 northds to become
active at the same time for a short moment. However, I still can't imagine how
the duplicated entries were created with different tunnel keys. If both northds
create the datapath binding for the same LS at the same time, they should
allocate the same tunnel key, and then one of them should fail during the
transaction commit because of an index conflict in the DB. But here they have
different keys, so both were inserted in the DB.
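
The reasoning above can be made concrete with a small Python model. It is a sketch under stated assumptions: the "smallest unused key" allocation rule, the function names and the list-based table are all hypothetical and are not taken from ovn-northd; only the idea of a unique index on the tunnel key comes from the message.

```python
class IndexConflict(Exception):
    """Raised when a commit would duplicate an indexed tunnel key."""


def allocate_tunnel_key(snapshot):
    """Pick the smallest key not used in the snapshot (assumed policy)."""
    used = {row["tunnel_key"] for row in snapshot}
    key = 1
    while key in used:
        key += 1
    return key


def commit(table, row):
    """Append the row unless its tunnel_key is already present."""
    if any(existing["tunnel_key"] == row["tunnel_key"] for existing in table):
        raise IndexConflict(f"tunnel_key {row['tunnel_key']} already exists")
    table.append(dict(row))


table = []
# Two northd instances work from the same (empty) view of the table.
key_a = allocate_tunnel_key(list(table))
key_b = allocate_tunnel_key(list(table))

commit(table, {"ls": "ls-1", "tunnel_key": key_a})
try:
    commit(table, {"ls": "ls-1", "tunnel_key": key_b})
    second_commit = "accepted"
except IndexConflict:
    second_commit = "rejected"
print(key_a, key_b, second_commit)

# With different keys, as observed, nothing stops the duplicate.
commit(table, {"ls": "ls-1", "tunnel_key": 2})
print(len(table))
```

In this model the race alone cannot produce the observed duplicates, which is why the different tunnel keys are the puzzling part.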

(OVSDB 

Re: [ovs-discuss] [OVN] DB backup and restore

2020-07-30 Thread Han Zhou
On Thu, Jul 30, 2020 at 7:04 PM Tony Liu  wrote:

> Hi,
>
>
>
> Just an update: I finally made this snapshot/rollback work.
> The rollback is not live though. Here is what I did.
>
> 1. Make a snapshot with ovsdb-client, assuming there are no ongoing
>    transactions and the data is consistent on all nodes. The
>    snapshot can be taken on any node. It doesn't include any
>    cluster info. That's probably why the man page says this is
>    for standalone and A/B only. But that cluster info seems
>    not to be required to restore.
>
> 2. To rollback/restore, stop services on all nodes, starting
>    from the followers and ending with the leader.
>
> 3. Pick a node as the new leader and copy the snapshot to be the DB
>    file. Then start the service. A cluster with a new cluster ID
>    will be created. The node will be allocated a new server ID
>    as well.
>
> 4. On the other two nodes, remove the DB file and restart the service
>    with remote-address pointing to the leader.
>
> Now, the new cluster starts working with the rollback data.
>

The steps you gave may work, but the procedure is weird. It is better to just
follow the steps described in this section:

https://github.com/openvswitch/ovs/blob/master/Documentation/ref/ovsdb.7.rst#backing-up-and-restoring-a-database


>
> "ovsdb-client restore" doesn't work for me, not sure why.
>
> ovsdb-client: ovsdb error: /dev/stdin: cannot identify file type
>
> I tried to restore both the snapshot created by backup and the
> directly copied DB file; neither of them works. Has anyone
> experienced this issue?
>
>
>
Maybe your command was wrong. Could you share your command line, and the
version used?


> To Numan, it would be great if you could share the details on how to use
> neutron-ovn-db-sync-util.
>
>
>
>
>
> Thanks!
>
>
>
> Tony
>
>
>
> *From: *Tony Liu
> *Sent: *Thursday, July 30, 2020 4:51 PM
> *To: *Numan Siddique; Han Zhou
> *Cc: *Han Zhou; ovs-dev; ovs-discuss
> *Subject: *Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore
>
>
>
> Hi Numan,
>
> I found this comment you made a few years back.
>
> - At neutron-server startup, OVN ML2 driver syncs the neutron
> DB and OVN DB if sync mode is set to repair.
> - Admin can run the "neutron-ovn-db-sync-util" to sync the DBs.
>
> Could you share the details to try those two options?
>
>
> Thanks!
>
> Tony
>
> From: Tony Liu
> Sent: Thursday, July 30, 2020 4:38 PM
> To: Han Zhou
> Cc: Han Zhou; ovs-dev; ovs-discuss
> Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore
>
> Hi,
>
> I have another thought after some digging. Since I am with
> OpenStack, all networking configurations come from OpenStack.
> I could snapshot the OpenStack MariaDB, restore it and run
> neutron-ovn-db-sync-util to update the OVN DB. Would that be a cleaner
> solution?
>
> BTW, I got this error when restoring the OVN DB.
> ovsdb-client: ovsdb error: /dev/stdin: cannot identify file type
>
> The file was created by the "backup" command.
>
>
> Thanks!
>
> Tony
>
> From: Tony Liu
> Sent: Thursday, July 30, 2020 3:41 PM
> To: Han Zhou
> Cc: Han Zhou; ovs-dev; ovs-discuss
> Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore
>
> Hi,
>
> A quick question here, given this man page:
> http://www.openvswitch.org/support/dist-docs/ovsdb-client.1.txt
>
> It says the backup and restore commands are for OVSDB standalone and
> active-backup databases.
>
> Can they be used for a RAFT cluster? If not, what would be the concern,
> like inconsistency?
>
> If I restore to a follower, is the request going to be forwarded to the
> leader to restore the DB for the whole cluster? But I believe it's recommended
> to restore to the leader directly for performance's sake.
>
> I am going to give it a try anyway and see how it works. I will make sure
> there is no configuration update from the OpenStack side while running the
> snapshot and restore process.
>
>
>
>
>
> Thanks!
>
>
>
> Tony
>
> From: Han Zhou
> Sent: Thursday, July 30, 2020 12:23 PM
> To: Tony Liu
> Cc: Han Zhou; ovs-discuss; ovs-dev
> Subject: Re: [ovs-discuss] [OVN] DB backup and restore
>
>
>
> On Thu, Jul 30, 2020 at 10:56 AM Tony Liu wrote:
> Hi Han,
>
> That doc helps. I will run some tests and update here. The use case I want
> to cover is snapshot/rollback and backup/restore.
>
> 
> Actually, "at-least-once" consistency, because OVSDB does not have a session
> mechanism to drop duplicate transactions if a connection drops after the
> server commits it but before the client 

Re: [ovs-discuss] [OVN] DB backup and restore

2020-07-30 Thread Tony Liu
Mmm... nb-db rolled back, but sb-db is not re-synced; ovn-northd
complains "clustered database server has stale data; trying
another server". Is there any way to work around it, or do I need to
snapshot and roll back sb-db as well?


Thanks!

Tony

From: Tony Liu
Sent: Thursday, July 30, 2020 7:04 PM
To: Numan Siddique; Han Zhou
Cc: Han Zhou; ovs-dev; 
ovs-discuss
Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore

Hi,

Just an update: I finally made this snapshot/rollback work.
The rollback is not live though. Here is what I did.

1. Make a snapshot with ovsdb-client, assuming there are no ongoing
   transactions and the data is consistent on all nodes. The
   snapshot can be taken on any node. It doesn't include any
   cluster info. That's probably why the man page says this is
   for standalone and A/B only. But that cluster info seems
   not to be required to restore.

2. To rollback/restore, stop services on all nodes, starting
   from the followers and ending with the leader.

3. Pick a node as the new leader and copy the snapshot to be the DB
   file. Then start the service. A cluster with a new cluster ID
   will be created. The node will be allocated a new server ID
   as well.

4. On the other two nodes, remove the DB file and restart the service
   with remote-address pointing to the leader.

Now, the new cluster starts working with the rollback data.

"ovsdb-client restore" doesn't work for me, not sure why.

ovsdb-client: ovsdb error: /dev/stdin: cannot identify file type

I tried to restore both the snapshot created by backup and the
directly copied DB file; neither of them works. Has anyone
experienced this issue?

To Numan, it would be great if you could share the details on how to use
neutron-ovn-db-sync-util.


Thanks!

Tony

From: Tony Liu
Sent: Thursday, July 30, 2020 4:51 PM
To: Numan Siddique; Han Zhou
Cc: Han Zhou; ovs-dev; 
ovs-discuss
Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore

Hi Numan,

I found this comment you made a few years back.

- At neutron-server startup, OVN ML2 driver syncs the neutron
DB and OVN DB if sync mode is set to repair.
- Admin can run the "neutron-ovn-db-sync-util" to sync the DBs.

Could you share the details to try those two options?


Thanks!

Tony

From: Tony Liu
Sent: Thursday, July 30, 2020 4:38 PM
To: Han Zhou
Cc: Han Zhou; ovs-dev; 
ovs-discuss
Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore

Hi,

I have another thought after some digging. Since I am with
OpenStack, all networking configurations come from OpenStack.
I could snapshot the OpenStack MariaDB, restore it and run
neutron-ovn-db-sync-util to update the OVN DB. Would that be a cleaner
solution?

BTW, I got this error when restoring the OVN DB.
ovsdb-client: ovsdb error: /dev/stdin: cannot identify file type

The file was created by the "backup" command.


Thanks!

Tony

From: Tony Liu
Sent: Thursday, July 30, 2020 3:41 PM
To: Han Zhou
Cc: Han Zhou; ovs-dev; 
ovs-discuss
Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore

Hi,

A quick question here, given this man page:
http://www.openvswitch.org/support/dist-docs/ovsdb-client.1.txt

It says the backup and restore commands are for OVSDB standalone and
active-backup databases.

Can they be used for a RAFT cluster? If not, what would be the concern,
like inconsistency?

If I restore to a follower, is the request going to be forwarded to the
leader to restore the DB for the whole cluster? But I believe it's recommended
to restore to the leader directly for performance's sake.

I am going to give it a try anyway and see how it works. I will make sure
there is no configuration update from the OpenStack side while running the
snapshot and restore process.





Thanks!



Tony

From: Han Zhou
Sent: Thursday, July 30, 2020 12:23 PM
To: Tony Liu
Cc: Han Zhou; 
ovs-discuss; 
ovs-dev
Subject: Re: [ovs-discuss] [OVN] DB backup and restore



On Thu, Jul 30, 2020 at 10:56 AM Tony Liu wrote:
Hi Han,

That doc helps. I will run some tests and update here. The use case I want
to cover is snapshot/rollback and backup/restore.


Actually, "at-least-once" consistency, because OVSDB does not have a session
mechanism to drop duplicate transactions if a connection drops after the server
commits 

Re: [ovs-discuss] [OVN] DB backup and restore

2020-07-30 Thread Tony Liu
Hi,

Just an update: I finally made this snapshot/rollback work.
The rollback is not live though. Here is what I did.

1. Make a snapshot with ovsdb-client, assuming there are no ongoing
   transactions and the data is consistent on all nodes. The
   snapshot can be taken on any node. It doesn't include any
   cluster info. That's probably why the man page says this is
   for standalone and A/B only. But that cluster info seems
   not to be required to restore.

2. To rollback/restore, stop services on all nodes, starting
   from the followers and ending with the leader.

3. Pick a node as the new leader and copy the snapshot to be the DB
   file. Then start the service. A cluster with a new cluster ID
   will be created. The node will be allocated a new server ID
   as well.

4. On the other two nodes, remove the DB file and restart the service
   with remote-address pointing to the leader.

Now, the new cluster starts working with the rollback data.
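
The ordering in steps 2 to 4 can be written down as a small planning helper. This is only a sketch: the node names, the function name and the action strings are hypothetical, and the code builds a list of (node, action) pairs for review rather than running any service or file commands.

```python
def rollback_plan(nodes, old_leader, new_leader):
    """Return ordered (node, action) pairs for the offline rollback."""
    for name in (old_leader, new_leader):
        if name not in nodes:
            raise ValueError(f"unknown node: {name}")
    plan = []
    # Step 2: stop the followers first, the old leader last.
    for node in nodes:
        if node != old_leader:
            plan.append((node, "stop service"))
    plan.append((old_leader, "stop service"))
    # Step 3: seed the new leader from the snapshot and start it.
    plan.append((new_leader, "copy snapshot to DB file"))
    plan.append((new_leader, "start service"))
    # Step 4: the remaining nodes discard their DB file and rejoin.
    for node in nodes:
        if node != new_leader:
            plan.append((node, "remove DB file"))
            plan.append(
                (node, f"restart service with remote-address -> {new_leader}"))
    return plan


if __name__ == "__main__":
    for node, action in rollback_plan(["n1", "n2", "n3"], "n2", "n1"):
        print(f"{node}: {action}")
```

Keeping the plan as data makes it easy to check the order before touching a cluster; the actual stop, copy and restart commands depend on the deployment and are deliberately left out.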

"ovs-client restore" doesn't work for me, not sure why.

ovsdb-client: ovsdb error: /dev/stdin: cannot identify file type

I tried to restore both the snapshot created by backup and the
directly copied DB file; neither of them works. Has anyone
experienced this issue?

To Numan, it would be great if you could share the details on how to use
neutron-ovn-db-sync-util.


Thanks!

Tony

From: Tony Liu
Sent: Thursday, July 30, 2020 4:51 PM
To: Numan Siddique; Han Zhou
Cc: Han Zhou; ovs-dev; 
ovs-discuss
Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore

Hi Numan,

I found this comment you made a few years back.

- At neutron-server startup, OVN ML2 driver syncs the neutron
DB and OVN DB if sync mode is set to repair.
- Admin can run the "neutron-ovn-db-sync-util" to sync the DBs.

Could you share the details to try those two options?


Thanks!

Tony

From: Tony Liu
Sent: Thursday, July 30, 2020 4:38 PM
To: Han Zhou
Cc: Han Zhou; ovs-dev; 
ovs-discuss
Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore

Hi,

I have another thought after some digging. Since I am with
OpenStack, all networking configurations come from OpenStack.
I could snapshot the OpenStack MariaDB, restore it and run
neutron-ovn-db-sync-util to update the OVN DB. Would that be a cleaner
solution?

BTW, I got this error when restoring the OVN DB.
ovsdb-client: ovsdb error: /dev/stdin: cannot identify file type

The file was created by the "backup" command.


Thanks!

Tony

From: Tony Liu
Sent: Thursday, July 30, 2020 3:41 PM
To: Han Zhou
Cc: Han Zhou; ovs-dev; 
ovs-discuss
Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore

Hi,

A quick question here, given this man page:
http://www.openvswitch.org/support/dist-docs/ovsdb-client.1.txt

It says the backup and restore commands are for OVSDB standalone and
active-backup databases.

Can they be used for a RAFT cluster? If not, what would be the concern,
like inconsistency?

If I restore to a follower, is the request going to be forwarded to the
leader to restore the DB for the whole cluster? But I believe it's recommended
to restore to the leader directly for performance's sake.

I am going to give it a try anyway and see how it works. I will make sure
there is no configuration update from the OpenStack side while running the
snapshot and restore process.





Thanks!



Tony

From: Han Zhou
Sent: Thursday, July 30, 2020 12:23 PM
To: Tony Liu
Cc: Han Zhou; 
ovs-discuss; 
ovs-dev
Subject: Re: [ovs-discuss] [OVN] DB backup and restore



On Thu, Jul 30, 2020 at 10:56 AM Tony Liu wrote:
Hi Han,

That doc helps. I will run some tests and update here. The use case I want
to cover is snapshot/rollback and backup/restore.


Actually, "at-least-once" consistency, because OVSDB does not have a session
mechanism to drop duplicate transactions if a connection drops after the server
commits it but before the client receives the result.

I saw duplicated datapath bindings for the same logical switch once, if you
recall. This may explain that. The ovn-northd connection to sb-db is dropped
before receiving the result. So ovn-northd initiates another transaction to
create datapath binding for the same logical switch.
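
The scenario described here, a reply lost after the server has already committed, can be reproduced with a toy model. Everything in this sketch is hypothetical (the Server and LossyConnection classes, the retry helper, the row contents); it models a server with no key constraint and no duplicate detection, which is the situation the quoted "at-least-once" remark is about.

```python
class Server:
    """Toy server: commits every request it receives, with no key checks."""

    def __init__(self):
        self.rows = []

    def commit(self, row):
        self.rows.append(dict(row))
        return "committed"


class LossyConnection:
    """Delivers requests but drops the first `lost_replies` replies."""

    def __init__(self, server, lost_replies):
        self.server = server
        self.lost_replies = lost_replies

    def transact(self, row):
        reply = self.server.commit(row)  # the server commits first
        if self.lost_replies > 0:
            self.lost_replies -= 1
            raise ConnectionError("connection dropped before the reply")
        return reply


def insert_with_retry(conn, row, attempts=3):
    """Resend the same insert until a reply arrives (at-least-once)."""
    for _ in range(attempts):
        try:
            return conn.transact(row)
        except ConnectionError:
            continue
    raise ConnectionError("no reply after retries")


server = Server()
conn = LossyConnection(server, lost_replies=1)
result = insert_with_retry(conn, {"logical_switch": "ls-1"})
print(result, len(server.rows))  # one intended insert, two committed rows
```

The client sees a single successful insert while the server holds two rows, unless something on the server side, such as a key constraint, rejects the second one.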

Yes, this is a possibility.
However, in reality, this is usually not a problem:

1) If DB schema has table keys properly defined, the redundant transaction from 
clients would be rejected by DB server because of key constraint check. In the 
datapath 

[ovs-discuss] [OVN] Where is DB cluster info stored?

2020-07-30 Thread Tony Liu

Hi,

Where is RAFT DB cluster info stored?
I see cluster ID in the DB file, but where is server ID stored?


Thanks!

Tony



Re: [ovs-discuss] [OVN] Where is DB cluster info stored?

2020-07-30 Thread Tony Liu
Sorry for the bother, please disregard this question.
The server ID is in the DB file as well.

Tony

From: Tony Liu
Sent: Thursday, July 30, 2020 5:22 PM
To: ovs-discuss
Subject: [OVN] Where is DB cluster info stored?


Hi,

Where is RAFT DB cluster info stored?
I see cluster ID in the DB file, but where is server ID stored?


Thanks!

Tony




Re: [ovs-discuss] [OVN] DB backup and restore

2020-07-30 Thread Tony Liu
Hi Numan,

I found this comment you made a few years back.

- At neutron-server startup, OVN ML2 driver syncs the neutron
DB and OVN DB if sync mode is set to repair.
- Admin can run the "neutron-ovn-db-sync-util" to sync the DBs.

Could you share the details to try those two options?


Thanks!

Tony

From: Tony Liu
Sent: Thursday, July 30, 2020 4:38 PM
To: Han Zhou
Cc: Han Zhou; ovs-dev; 
ovs-discuss
Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore

Hi,

I have another thought after some digging. Since I am with
OpenStack, all networking configurations come from OpenStack.
I could snapshot the OpenStack MariaDB, restore it and run
neutron-ovn-db-sync-util to update the OVN DB. Would that be a cleaner
solution?

BTW, I got this error when restoring the OVN DB.
ovsdb-client: ovsdb error: /dev/stdin: cannot identify file type

The file was created by the "backup" command.


Thanks!

Tony

From: Tony Liu
Sent: Thursday, July 30, 2020 3:41 PM
To: Han Zhou
Cc: Han Zhou; ovs-dev; 
ovs-discuss
Subject: Re: [ovs-dev] [ovs-discuss] [OVN] DB backup and restore

Hi,

A quick question here. Given this man page.
http://www.openvswitch.org/support/dist-docs/ovsdb-client.1.txt

It says backup and restore commands are for OVSDB standalone and

active-backup databases.



Can they be used for RAFT cluster? If not, what would be the concern,

like inconsistency?



If I restore to a follower, is the request going to be forwarded to the

leader to restore DB for the whole cluster? But I believe it's recommended

to restore to the leader directly for performance sake.



I am going to give it a try anyways, see how it works. Will make sure

there is no configuration update from OpenStack side while running such

snapshot and restore process.





Thanks!



Tony

From: Han Zhou
Sent: Thursday, July 30, 2020 12:23 PM
To: Tony Liu
Cc: Han Zhou; 
ovs-discuss; 
ovs-dev
Subject: Re: [ovs-discuss] [OVN] DB backup and restore



On Thu, Jul 30, 2020 at 10:56 AM Tony Liu 
mailto:tonyliu0...@hotmail.com>> wrote:
Hi Han,

That doc helps. I will run some tests and update here. The use case I want
to cover is snapshot/rollback and backup/restore.


Actually, "at-least-once" consistency, because OVSDB does not have a session
mechanism to drop duplicate transactions if a connection drops after the server
commits it but before the client receives the result.

I saw duplicated datapath bindings for the same logical switch once, if you
recall. This may explain that. The ovn-northd connection to sb-db is dropped
before receiving the result. So ovn-northd initiates another transaction to
create datapath binding for the same logical switch.

Yes, this is a possibility.
However, in reality, this is usually not a problem:

1) If DB schema has table keys properly defined, the redundant transaction from 
clients would be rejected by DB server because of key constraint check. In the 
datapath binding case, this doesn't work because of the poor definition of the 
datapath_binding table. It should have had "logical_switch_router" column 
defined and set as a key (in addition to the "tunnel_key") instead of storing 
it in external_ids. The duplicated entries would have been avoided. The other 
tables such as port_binding would never have such problem.

2) OVSDB clients usually monitors and syncs all (interested) data from server 
to local, so when they do declarative processing, they could correct problems 
by themselves. In fact, ovn-northd does the check and deletes duplicated 
datapaths. I did a simple test and it did cleanup by itself:
2020-07-30T18:55:53.057Z|6|ovn_northd|INFO|ovn-northd lock acquired. This 
ovn-northd instance is now active.
2020-07-30T19:02:10.465Z|7|ovn_northd|INFO|deleting Datapath_Binding 
abef9503-445e-4a52-ae88-4c826cbad9d6 with duplicate 
external-ids:logical-switch/router ee80c38b-2016-4cbc-9437-f73e3a59369e

I am not sure why in your case north was stuck, but I agree there must be 
something wrong. Please collect northd logs if you encounter this again so we 
can dig further.

I see two ways to improve it.
1) On client side, if the connection is broken while waiting for the result
   of a transaction, the client checks the transaction state, committed or not,
   when it reconnects to the leader (maybe a different node).
   Do we have such check today?

Clients does check. In this case when transaction was actually successful but 
appears to be failed from client point of view, the check doesn't help.

2) I see client connection is dropped by the leader when it's busy. I don't
   

Re: [ovs-discuss] [OVN] DB backup and restore

2020-07-30 Thread Tony Liu
Hi,

I have another thought after some digging. Since I am using
OpenStack, all networking configuration comes from OpenStack.
I could snapshot the OpenStack MariaDB, restore it, and run
neutron-ovn-db-sync to update the OVN DB. Would that be a cleaner
solution?

BTW, I got this error when restoring the OVN DB:
ovsdb-client: ovsdb error: /dev/stdin: cannot identify file type

The file was created by the "backup" command.


Thanks!

Tony

Re: [ovs-discuss] [OVN] DB backup and restore

2020-07-30 Thread Tony Liu
Hi,

A quick question here. Given this man page:
http://www.openvswitch.org/support/dist-docs/ovsdb-client.1.txt

It says the backup and restore commands are for OVSDB standalone and
active-backup databases.

Can they be used for a RAFT cluster? If not, what would be the concern,
like inconsistency?

If I restore to a follower, is the request going to be forwarded to the
leader to restore the DB for the whole cluster? But I believe it's recommended
to restore to the leader directly for performance's sake.

I am going to give it a try anyway and see how it works. I will make sure
there is no configuration update from the OpenStack side while running such a
snapshot and restore process.


Thanks!

Tony


Re: [ovs-discuss] [OVN] DB backup and restore

2020-07-30 Thread Han Zhou
On Thu, Jul 30, 2020 at 10:56 AM Tony Liu wrote:

> Hi Han,
>
> That doc helps. I will run some tests and update here. The use case I want
> to cover is snapshot/rollback and backup/restore.
>
> Actually, "at-least-once" consistency, because OVSDB does not have a session
> mechanism to drop duplicate transactions if a connection drops after the
> server commits it but before the client receives the result.
>
> I saw duplicated datapath bindings for the same logical switch once, if you
> recall. This may explain that. The ovn-northd connection to sb-db is dropped
> before receiving the result. So ovn-northd initiates another transaction to
> create datapath binding for the same logical switch.
>
Yes, this is a possibility.
However, in reality, this is usually not a problem:

1) If the DB schema has table keys properly defined, the redundant transaction
from the client would be rejected by the DB server because of the key
constraint check. In the datapath binding case, this doesn't work because of
the poor definition of the datapath_binding table. It should have had a
"logical_switch_router" column defined and set as a key (in addition to
"tunnel_key") instead of storing it in external_ids. The duplicated entries
would then have been avoided. Other tables such as port_binding would never
have such a problem.
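
To make point 1) concrete, here is a minimal Python sketch of the retry
scenario. It is a toy in-memory model, not OVSDB or ovn-northd code:
`ToyTable`, `DuplicateKeyError`, `create_with_lost_reply`, and the
`logical_datapath` column name are all made up for illustration. It also
assumes the retried transaction picks a fresh tunnel key, which is one way the
existing key on "tunnel_key" alone could fail to stop the duplicate.

```python
class DuplicateKeyError(Exception):
    """Raised when an insert would violate one of the table's keys."""


class ToyTable:
    """Toy in-memory table; each entry in `indexes` is a tuple of key columns."""

    def __init__(self, indexes):
        self.indexes = [tuple(index) for index in indexes]
        self.rows = []

    def insert(self, row):
        # Reject the row if it collides with an existing row on any key.
        for index in self.indexes:
            key = tuple(row[col] for col in index)
            for existing in self.rows:
                if tuple(existing[col] for col in index) == key:
                    raise DuplicateKeyError(f"{index} = {key}")
        self.rows.append(dict(row))


def create_with_lost_reply(table, logical_datapath):
    """Commit once, lose the reply, then retry the way a client would."""
    # First attempt: the server commits, but the connection drops before
    # the client sees the result.
    table.insert({"logical_datapath": logical_datapath, "tunnel_key": 1})
    # Retry: the client believes nothing was committed and tries again
    # (assumed here to use a fresh tunnel key).
    try:
        table.insert({"logical_datapath": logical_datapath, "tunnel_key": 2})
    except DuplicateKeyError:
        return "retry rejected"
    return "retry committed"


if __name__ == "__main__":
    switch = "ee80c38b-2016-4cbc-9437-f73e3a59369e"
    tunnel_key_only = ToyTable(indexes=[("tunnel_key",)])
    print(create_with_lost_reply(tunnel_key_only, switch),
          len(tunnel_key_only.rows))
    with_datapath_key = ToyTable(
        indexes=[("tunnel_key",), ("logical_datapath",)])
    print(create_with_lost_reply(with_datapath_key, switch),
          len(with_datapath_key.rows))
```

With only the tunnel_key key, the retry lands as a second row for the same
logical switch. With the extra key, the server-side check turns the retry into
an error and only one row exists.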

2) OVSDB clients usually monitor and sync all data of interest from the
server to a local copy, so when they do declarative processing, they can
correct problems by themselves. In fact, ovn-northd does the check and deletes
duplicated datapaths. I did a simple test and it did clean up by itself:
2020-07-30T18:55:53.057Z|6|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2020-07-30T19:02:10.465Z|7|ovn_northd|INFO|deleting Datapath_Binding abef9503-445e-4a52-ae88-4c826cbad9d6 with duplicate external-ids:logical-switch/router ee80c38b-2016-4cbc-9437-f73e3a59369e
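
The cleanup in the log above can be pictured with a small Python sketch. This
is not the ovn-northd implementation: `find_duplicate_bindings` is a
hypothetical helper, the rows are plain (uuid, external_ids) tuples, the
`logical-switch` / `logical-router` keys are taken from the log message, and
the first row's UUID is invented.

```python
def find_duplicate_bindings(bindings):
    """Return UUIDs of bindings whose logical switch/router was already seen.

    `bindings` is an iterable of (row_uuid, external_ids) pairs; the first
    binding seen for each logical switch/router is kept.
    """
    seen_owners = set()
    duplicates = []
    for row_uuid, external_ids in bindings:
        owner = (external_ids.get("logical-switch")
                 or external_ids.get("logical-router"))
        if owner is None:
            continue  # nothing to compare against
        if owner in seen_owners:
            duplicates.append(row_uuid)
        else:
            seen_owners.add(owner)
    return duplicates


if __name__ == "__main__":
    switch = "ee80c38b-2016-4cbc-9437-f73e3a59369e"
    bindings = [
        # Invented UUID for the binding that is kept.
        ("11111111-1111-1111-1111-111111111111", {"logical-switch": switch}),
        ("abef9503-445e-4a52-ae88-4c826cbad9d6", {"logical-switch": switch}),
    ]
    for row_uuid in find_duplicate_bindings(bindings):
        print("would delete Datapath_Binding", row_uuid)
```

Because the client holds a full local copy of the table, this kind of check
needs no extra round trip to the server; only the delete itself is a
transaction.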

I am not sure why northd was stuck in your case, but I agree there must be
something wrong. Please collect the northd logs if you encounter this again so
we can dig further.

> I see two ways to improve it.
>
> 1) On client side, if the connection is broken while waiting for the result
>    of a transaction, the client checks the transaction state, committed or
>    not, when it reconnects to the leader (maybe a different node).
>    Do we have such check today?
>

Clients do check. But in this case, when the transaction was actually
successful but appears to have failed from the client's point of view, the
check doesn't help.


> 2) I see client connection is dropped by the leader when it's busy. I don't
>    think this is a good way to control the traffic. The server can cache and
>    hold the request when it's busy, or even push back. Dropping connection
>    is not a good option. Any thoughts here?
>
The server doesn't make this kind of decision. It could simply be
overloaded and disconnected from the cluster, or even worse, a node could
crash after committing the transaction.

Thanks,
Han



Re: [ovs-discuss] [OVN] DB backup and restore

2020-07-30 Thread Tony Liu
Hi Han,

That doc helps. I will run some tests and update here. The use case I want
to cover is snapshot/rollback and backup/restore.


Actually, "at-least-once" consistency, because OVSDB does not have a session
mechanism to drop duplicate transactions if a connection drops after the server
commits it but before the client receives the result.

I saw duplicated datapath bindings for the same logical switch once, if you
recall. This may explain that. The ovn-northd connection to sb-db is dropped
before receiving the result. So ovn-northd initiates another transaction to
create datapath binding for the same logical switch.

I see two ways to improve it.
1) On client side, if the connection is broken while waiting for the result
   of a transaction, the client checks the transaction state, committed or not,
   when it reconnects to the leader (maybe a different node).
   Do we have such check today?
2) I see client connection is dropped by the leader when it's busy. I don't
   think this is a good way to control the traffic. The server can cache and
   hold the request when it's busy, or even push back. Dropping connection
   is not a good option. Any thoughts here?


Thanks!

Tony




Re: [ovs-discuss] Display OpenFlow port number, interface name and bridge name in single OvS CLI command

2020-07-30 Thread Matteo Olivi
Thanks.

That indeed works, but I was looking for a single ovs command.
Is there a way to get what I want with just one command?

Regards,
Matteo.

On Tue, Jul 28, 2020 at 10:05 PM Tony Liu  wrote:

>
> for p in $(ovs-vsctl list-ports br-int); do \
> ovs-vsctl -f table --columns=ofport,name list interface $p; \
> done
>
> Tony
> --
> *From:* discuss on behalf of Matteo Olivi
> *Sent:* July 28, 2020 11:14 AM
> *To:* ovs-discuss@openvswitch.org
> *Subject:* [ovs-discuss] Display OpenFlow port number, interface name and
> bridge name in single OvS CLI command
>
> Hello everyone,
> I have an OvS bridge *X* and some network interfaces connected to it via
> OpenFlow ports.
> For each interface connected to *X*, I want to display the name and the
> number of its OpenFlow port.
> I've been using the following command:
> "ovs-vsctl -f table -- --columns=ofport,name list Interface"
>
> The problem with the command above is that it lists the interface name and
> OpenFlow port number for the interfaces connected to all the bridges on the
> host, while I only want the interfaces connected to bridge *X*. Is there a
> single command to obtain what I need? Or is there a single database table
> whose rows store the information that I need, i.e. the triplet
> (bridge name, interface name, OpenFlow port number)?
>
> Thanks,
> Matteo.
>


Re: [ovs-discuss] [OVN] DB backup and restore

2020-07-30 Thread Han Zhou
On Wed, Jul 29, 2020 at 10:58 PM Tony Liu wrote:
>
> Hi,
>
> Is there any guidance on backing up and restoring the OVN nb-db and sb-db?
>
> Is /var/lib/openvswitch/ovn-[ns]b/ovn[ns]b.db the only database file?
>
> For a 3-node DB cluster, is the replication factor 3 (is the data
> replicated onto all 3 nodes)?
>
> Are the DB files on the 3 nodes identical?
>
> If I stop a DB follower and empty the DB file on the follower node,
> when I start it back, is the whole DB going to be replicated to it?
>
> To back up the DB, is it OK to copy the DB file from any node, assuming
> no transaction is ongoing?
>
> Is the following going to work to restore the DB?
> * Stop all 3 DBs.
> * Copy the backup DB file to one node, and empty the DB file on the other
>   two nodes.
> * Bootstrap the node with the DB file.
> * Start the other two nodes to join the cluster.
>

For ovsdb operations, please refer to "man 7 ovsdb", or here:
https://github.com/openvswitch/ovs/blob/master/Documentation/ref/ovsdb.7.rst

>
> Do I need to restore sb-db as well? Or can I restore nb-db only and let
> ovn-northd sync data from nb-db to sb-db? Should chassis data be
> updated by ovn-controller?
>

You don't have to restore sb-db. ovn-northd and the ovn-controllers will sync
the data in the SB DB.
However, it may take quite some time to sync if the scale is large.
Also, remember that the mac_binding table in SB will not be restored by
ovn-controller, because it is populated as a result of ARP packet handling
by ovn-controller. The entries will be generated again only when new ARP
packets are observed by ovn-controller.
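
As a rough illustration of why restoring nb-db alone is mostly enough, here is
a toy Python model. It is only a sketch: `resync_southbound` and the
dictionary layout are hypothetical, and the real ovn-northd and ovn-controller
logic is far more involved. The point it shows is that tables derived from NB
can be recomputed, while a learned table such as mac_binding stays empty until
traffic repopulates it.

```python
def resync_southbound(northbound, southbound):
    """Recompute the NB-derived parts of a toy SB database in place."""
    switches = northbound["logical_switches"]  # {switch name: [port names]}
    # Derived state: can always be rebuilt from NB contents.
    southbound["datapath_binding"] = sorted(switches)
    southbound["port_binding"] = sorted(
        port for ports in switches.values() for port in ports
    )
    # Learned state: mac_binding comes from observed ARP packets, so there is
    # nothing in NB to rebuild it from; whatever is there is left untouched.
    southbound.setdefault("mac_binding", [])
    return southbound


if __name__ == "__main__":
    nb = {"logical_switches": {"ls1": ["p1", "p2"], "ls2": ["p3"]}}
    sb = {}  # empty SB, as after restoring only nb-db
    print(resync_southbound(nb, sb))
```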

>
> I am running a scaling test. It takes quite a lot of time to build the
> configuration. I am wondering if I can back up and restore the DB to roll
> back to some checkpoint, to avoid starting all over.
>
> Thanks!
>
> Tony