[ovs-discuss] OVN nb-db and sb-db election timer

2020-07-27 Thread Tony Liu
Hi,

During a scaling test, when sb-db is busy, followers believe the leader is dead
and start an election. Some inconsistency arises during such a leader switch:
two datapath bindings are created for the same logical switch. To avoid this,
I was advised to increase the election timer tenfold. With that setting, 4K
networks are created successfully.

Is it necessary to set a large election timer for nb-db as well? The nb-db
doesn't seem very busy during the test, while sb-db is always busy, taking
90+% CPU.

With such a large election timer, if a real problem occurs, such as the leader
node going down, will it take a while for a new leader to be elected?
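
For reference, a minimal sketch of how a tenfold increase could be planned. It
assumes the behavior described in ovsdb-server(1): the timer is changed on the
leader with "cluster/change-election-timer", the default is 1000 ms, and each
change may at most double the current value. The helper names and the control
socket path are illustrative assumptions, not something taken from this thread.

```python
def election_timer_steps(current_ms, target_ms):
    """Return the intermediate timer values needed to reach target_ms,
    assuming each change may at most double the current value."""
    if current_ms <= 0 or target_ms < current_ms:
        raise ValueError("expected 0 < current_ms <= target_ms")
    steps = []
    while current_ms < target_ms:
        current_ms = min(current_ms * 2, target_ms)
        steps.append(current_ms)
    return steps


def change_timer_commands(db, ctl_path, current_ms, target_ms):
    """Build one ovs-appctl command line per step (nothing is executed)."""
    return [
        ["ovs-appctl", "-t", ctl_path,
         "cluster/change-election-timer", db, str(ms)]
        for ms in election_timer_steps(current_ms, target_ms)
    ]


if __name__ == "__main__":
    # Assumed socket path; it differs between packagings.
    for cmd in change_timer_commands(
        "OVN_Southbound", "/var/run/ovn/ovnsb_db.ctl", 1000, 10000
    ):
        print(" ".join(cmd))
```

Since followers wait about one election timer without a heartbeat before
starting an election, a larger value should also delay failover after a real
leader failure by roughly the same amount.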


Thanks!

Tony

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] What is the most efficient way to add many ports to an OVS bridge ?

2020-07-27 Thread Ben Pfaff
On Fri, Jul 24, 2020 at 01:17:36PM +0200, Paul HENG wrote:
> I'm working on a project that needs to add a lot of ports to OVS bridges
> (from 10 to potentially over 1K ports) and I'm wondering what is the most
> time efficient way to add them to the different bridges. At the moment I'm
> using a single ovs-vsctl command to apply all the add-port commands, but are
> there any alternatives that might be interesting ? I'm calling ovs-vsctl
> from a Golang program so I was also wondering if ovs-vsctl can be used
> safely by multiple threads at the same time or if it should be avoided.

I think that there's a Go language binding to the OVSDB protocol.  If
so, it's probably faster than using a separate ovs-vsctl process.
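
As an illustration of the single-invocation approach described in the quoted
question, here is a minimal sketch that builds one ovs-vsctl command line
containing every add-port command, separated by "--". The function name and
the bridge and port names are hypothetical; the sketch only builds the
argument list and does not run anything.

```python
def build_add_ports_argv(ports_by_bridge, may_exist=True):
    """Build a single ovs-vsctl argv that adds all ports in one invocation.

    ports_by_bridge maps a bridge name to an iterable of port names.
    """
    argv = ["ovs-vsctl"]
    for bridge, ports in ports_by_bridge.items():
        for port in ports:
            argv.append("--")  # separates commands within one invocation
            if may_exist:
                # Do not fail if the port is already on the bridge.
                argv.append("--may-exist")
            argv += ["add-port", bridge, port]
    if len(argv) == 1:
        raise ValueError("no ports given")
    return argv


if __name__ == "__main__":
    # Hypothetical bridges and ports, for illustration only.
    argv = build_add_ports_argv({"br0": ["p1", "p2"], "br1": ["p3"]})
    print(" ".join(argv))
```

The resulting list can be passed to subprocess.run(argv, check=True) on a host
where ovs-vsctl is installed.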


Re: [ovs-discuss] OVN scale

2020-07-27 Thread Tony Liu
Hi Han,

Just some updates here.

I tried with 4K networks on a single router. Configuration was done without
any issues. I checked both nb-db and sb-db, and both look good. It's just that
the router configuration is huge (in the Neutron DB, nb-db and the flow table
in sb-db), because it contains all 4K ports. Also, the pipeline of the router
datapath in sb-db is quite big.

I see the ovn-northd master and the sb-db leader are busy, taking 90+% CPU.
There are only 3 compute nodes and 2 gateway nodes. Does the monitor setting
"ovn-monitor-all" matter in such a case? Any idea what they are busy with,
without any configuration updates from OpenStack? The nb-db is not busy though.

Probably because sb-db is busy, ovn-controller can't stay connected to it
consistently. It keeps being disconnected and reconnecting. Restarting
ovn-controller seems to help. I am able to launch a few VMs on different
networks and they are connected via the router.

Now, I have a problem with external access. The router is set as the gateway
to a provider/underlay network on an interface on the gateway node, and it is
allocated an underlay address from that provider network. My understanding is
that br-ex on the gateway node holding the active router will broadcast ARP to
announce that router underlay address in case of failover. It will also
respond to ARP requests for that address. But when I run tcpdump on that
underlay interface on the gateway node, I see ARP requests coming in but no
ARP responses going out. I checked the flow table in sb-db, and it seems OK.
I also checked the flows on br-ex with "ovs-ofctl dump-flows br-ex", and I
don't see anything about ARP there. How should I look into it?

Again, the use case is to support 4K networks with external access (security
groups are disabled). The options are 4K routers (one per network), 50 routers
(one per 80 networks), 1 router (for all 4K networks), etc. All networks are
isolated by ACLs on the logical router. Which option should work better?
Any comment is appreciated.


Thanks!

Tony



From: discuss  on behalf of Tony Liu 

Sent: July 21, 2020 09:09 PM
To: Daniel Alvarez 
Cc: ovs-discuss@openvswitch.org 
Subject: Re: [ovs-discuss] OVN scale

[root@ovn-db-2 ~]# ovn-nbctl list nb_global
_uuid        : b7b3aa05-f7ed-4dbc-979f-10445ac325b8
connections  : []
external_ids : {"neutron:liveness_check_at"="2020-07-22 04:03:17.726917+00:00"}
hv_cfg       : 312
ipsec        : false
name         : ""
nb_cfg       : 2636
options      : {mac_prefix="ca:e8:07", svc_monitor_mac="4e:d0:3a:80:d4:b7"}
sb_cfg       : 2005
ssl          : []

[root@ovn-db-2 ~]# ovn-sbctl list sb_global
_uuid        : 3720bc1d-b0da-47ce-85ca-96fa8d398489
connections  : []
external_ids : {}
ipsec        : false
nb_cfg       : 312
options      : {mac_prefix="ca:e8:07", svc_monitor_mac="4e:d0:3a:80:d4:b7"}
ssl          : []

The NBDB and SBDB are definitely out of sync. Is there any way to force
ovn-northd to sync them?
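
To make the gap concrete, here is a minimal sketch that parses "name : value"
lines like the listing above and reports how far behind each stage is. It
relies on the column meanings given in ovn-nb(5): nb_cfg is bumped by the CMS,
sb_cfg is set by ovn-northd once it has applied that configuration to the
SBDB, and hv_cfg is the highest value all hypervisors have caught up to. The
function names are hypothetical.

```python
import re

# Matches lines such as "nb_cfg       : 2636"; wrapped continuation
# lines that do not start with a column name are ignored.
_FIELD = re.compile(r"^\s*([A-Za-z_]\w*)\s*:\s*(.*)$")


def parse_list_output(text):
    """Parse 'name : value' lines from 'ovn-nbctl list' output into a dict."""
    record = {}
    for line in text.splitlines():
        match = _FIELD.match(line)
        if match:
            record[match.group(1)] = match.group(2).strip()
    return record


def sync_lag(nb_global):
    """Return how many sequence numbers ovn-northd and the hypervisors trail."""
    nb_cfg = int(nb_global["nb_cfg"])
    return {
        "northd_behind": nb_cfg - int(nb_global["sb_cfg"]),
        "hypervisors_behind": nb_cfg - int(nb_global["hv_cfg"]),
    }


if __name__ == "__main__":
    sample = """\
hv_cfg       : 312
nb_cfg       : 2636
sb_cfg       : 2005
"""
    print(sync_lag(parse_list_output(sample)))
```

With the values shown above, ovn-northd trails the NBDB by 631 sequence
numbers and the hypervisors trail it by 2324.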

Thanks!

Tony


From: Tony Liu 
Sent: July 21, 2020 08:39 PM
To: Daniel Alvarez 
Cc: Cory Hawkless ; ovs-discuss@openvswitch.org 
; Dumitru Ceara 
Subject: Re: [ovs-discuss] OVN scale

When a network (and subnet) is created on OpenStack, a GW port and a service
port (for DHCP and metadata) are also created. They are created in Neutron and
ovn-nb-db by the ML2 driver. Then ovn-northd translates such updates from NBDB
to SBDB. My question is: with 20.03, is this translation incremental?

After creating 4000 networks successfully on OpenStack, I see 4000 logical
switches and 8000 LS ports in NBDB. But in SBDB, there are only 1567 port
bindings. The break happened when translating the 1568th port. If ovn-northd
recompiles the whole DB for every update, this problem can be explained: the
DB is too big for ovn-northd to compile in time, so all subsequent updates are
lost. Does that make sense?

I recall that DB updates are coordinated by some "version": when changes
happen in NBDB, the version is bumped, then ovn-northd updates SBDB and bumps
its version as well, so they match. So, if the NBDB version is bumped more
than once while ovn-northd is updating SBDB, is that still going to work? If
yes, then it's just a matter of time: no matter how fast updates happen in
NBDB, ovn-northd will eventually catch up. Am I right about that?
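
The catch-up question can be illustrated with a toy model. It is not
ovn-northd's actual code. It assumes that each ovn-northd run works from the
NBDB state it read at the start of the run and then publishes the sequence
number of that state, so several bumps that land during one run are picked up
together by the next run. The function and parameter names are hypothetical.

```python
def simulate(bumps_per_run):
    """Toy model of a sequence-number handshake.

    bumps_per_run[i] is the number of NBDB version bumps that land while
    run i is in progress. Returns (nb_version, sb_version) after each run.
    """
    nb_version = 0
    sb_version = 0
    history = []
    for bumps in bumps_per_run:
        snapshot = nb_version   # the run starts from the current NBDB state
        nb_version += bumps     # the CMS keeps writing while the run works
        sb_version = snapshot   # the run publishes the version it processed
        history.append((nb_version, sb_version))
    return history


if __name__ == "__main__":
    # Three bumps during the first run, two during the second, then quiet.
    for nb_version, sb_version in simulate([3, 2, 0, 0]):
        print(nb_version, sb_version)
```

In this model the two versions match one run after the NBDB stops changing,
however many bumps accumulated in between.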

Any comment is welcome.


Thanks!

Tony



From: Tony Liu 
Sent: July 21, 2020 10:22 AM
To: Daniel Alvarez 
Cc: Cory Hawkless ; ovs-discuss@openvswitch.org 
; Dumitru Ceara 
Subject: Re: [ovs-discuss] OVN scale

Hi Daniel, all

4000 networks and 50 routers, 200 networks on each router, have all been
created. CPU usage of Neutron server, ovn-nb-db, ovn-northd, ovn-sb-db,
ovn-controller and ovs-vswitchd is OK: not consistently at 100%, though there
are still some spikes to it.

Now, when create VM, I got that "waiting for vif-plugged