Re: [ovs-discuss] OVN scale

Tony Liu Tue, 04 Aug 2020 19:55:12 -0700

Hi,

Continuing this thread with some updates.


I finally got 4096 networks and 256 routers created, with 16 networks
connected to each router. All routers have an external gateway set.

On the underlay, those 256 gateway addresses on the provider network are
reachable. Ping is steady.

I launched 10 VMs on one compute node. One of them failed because network
allocation failed. I haven't looked into that yet.

Pinging a VM from the underlay is bumpy: there is a 1s or 2s delay about
every 10 pings.

I can't launch any more VMs. It always fails.

One of the Neutron nodes is very busy. Judging from the INFO-level log,
it just keeps connecting to OVN.

The active ovn-northd is busy, but all ovn-nb-db and ovn-sb-db are not.

On the compute node, ovn-controller is very busy. It keeps logging
"commit failed".
====================
2020-08-05T02:44:23.927Z|04125|reconnect|INFO|tcp:10.6.20.84:6642: connected
2020-08-05T02:44:23.936Z|04126|main|INFO|OVNSB commit failed, force recompute next time.
2020-08-05T02:44:23.938Z|04127|ovsdb_idl|INFO|tcp:10.6.20.84:6642: clustered database server is disconnected from cluster; trying another server
2020-08-05T02:44:23.939Z|04128|reconnect|INFO|tcp:10.6.20.84:6642: connection attempt timed out
2020-08-05T02:44:23.939Z|04129|reconnect|INFO|tcp:10.6.20.84:6642: waiting 2 seconds before reconnect
====================
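
To see which messages dominate a log like this, a small tally can help.
The following is only a minimal sketch, assuming the usual
timestamp|sequence|module|LEVEL|message layout seen in the excerpt above.
The function names (parse_records, tally) are made up for illustration and
are not part of any OVS or OVN tooling.

```python
import re
from collections import Counter

# Assumed record layout: timestamp|sequence|module|LEVEL|message
RECORD = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T[\d:.]+Z)\|(?P<seq>\d+)\|"
    r"(?P<module>[^|]+)\|(?P<level>[A-Z]+)\|(?P<msg>.*)$"
)


def parse_records(lines):
    """Yield one dict per log record, folding mail-wrapped continuation lines."""
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = RECORD.match(line)
        if match:
            if current is not None:
                yield current
            current = match.groupdict()
        elif current is not None and line.strip():
            # Not a new record: treat it as the wrapped tail of the previous one.
            current["msg"] = current["msg"].rstrip() + " " + line.strip()
    if current is not None:
        yield current


def tally(lines):
    """Count records per (module, level) pair."""
    return Counter((r["module"], r["level"]) for r in parse_records(lines))


# Three records copied from the excerpt above, two of them wrapped by the mailer.
SAMPLE = """\
2020-08-05T02:44:23.927Z|04125|reconnect|INFO|tcp:10.6.20.84:6642: connected
2020-08-05T02:44:23.936Z|04126|main|INFO|OVNSB commit failed, force recompute
next time.
2020-08-05T02:44:23.939Z|04128|reconnect|INFO|tcp:10.6.20.84:6642: connection
attempt timed out
""".splitlines()

if __name__ == "__main__":
    for (module, level), count in tally(SAMPLE).most_common():
        print(f"{module:12s} {level:5s} {count}")
```

Feeding the whole ovn-controller log file (for example, open(path) as the
lines argument) instead of SAMPLE would show how often the "commit failed"
and reconnect records occur relative to everything else.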

The connection to the local OVSDB keeps being dropped because there is no
probe response. The probe interval is already set to 30s.
========
2020-08-05T02:47:15.437Z|04351|poll_loop|INFO|wakeup due to [POLLIN] on fd 20 (10.6.20.22:42362<->10.6.20.86:6642) at lib/stream-fd.c:157 (100% CPU usage)
2020-08-05T02:47:15.438Z|04352|reconnect|WARN|tcp:127.0.0.1:6640: connection dropped (Broken pipe)
2020-08-05T02:47:15.438Z|04353|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
2020-08-05T02:47:15.449Z|04354|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connected
========

There is also an error about the localnet port.
========
2020-08-05T02:47:15.403Z|04345|patch|ERR|bridge not found for localnet port 'provnet-006baf64-409d-434d-b95b-017a77969b55' with network name 'physnet1'
========
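
As far as I understand, ovn-controller resolves the network name of a
localnet port through the ovn-bridge-mappings key in the external_ids of
the local Open_vSwitch table, which holds "network:bridge" pairs separated
by commas. The following is a minimal sketch, under that assumption, of
checking whether a network name such as 'physnet1' has a mapping. The
function names and the sample mapping values are hypothetical; the real
value would come from something like
"ovs-vsctl get open . external_ids:ovn-bridge-mappings" on the node.

```python
def parse_bridge_mappings(value):
    """Parse a 'network:bridge,network:bridge' string into a dict."""
    mappings = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate an empty value or a trailing comma
        network, sep, bridge = item.partition(":")
        if not sep or not network.strip() or not bridge.strip():
            raise ValueError(f"malformed mapping entry: {item!r}")
        mappings[network.strip()] = bridge.strip()
    return mappings


def unmapped_networks(network_names, mappings_value):
    """Return the network names that have no bridge mapping, sorted."""
    mappings = parse_bridge_mappings(mappings_value)
    return sorted(name for name in set(network_names) if name not in mappings)


if __name__ == "__main__":
    # Hypothetical mapping value for illustration only.
    print(unmapped_networks(["physnet1"], "physnet2:br-ex"))
```

If 'physnet1' shows up as unmapped on a node, that would be consistent
with the "bridge not found for localnet port" error there.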

First of all, this kind of scale should work fine, right?

Any advice on how to look into it?


Thanks!

Tony

> -----Original Message-----
> From: dev <ovs-dev-boun...@openvswitch.org> On Behalf Of Tony Liu
> Sent: Monday, July 27, 2020 10:16 AM
> To: Han Zhou <hz...@ovn.org>
> Cc: ovs-...@openvswitch.org; ovs-discuss@openvswitch.org
> Subject: Re: [ovs-dev] [ovs-discuss] OVN scale
> 
> Hi Han,
> 
> Just some updates here.
> 
> I tried with 4K networks on a single router. Configuration was done
> without any issues. I checked both nb-db and sb-db, and they look good.
> It's just that the router configuration is huge (in the Neutron DB, nb-db
> and the flow table in sb-db), because it contains all 4K ports. Also, the
> pipeline of the router datapath in sb-db is quite big.
> 
> I see the ovn-northd master and the sb-db leader are busy, taking 90+% CPU.
> There are only 3 compute nodes and 2 gateway nodes. Does the monitor
> setting "ovn-monitor-all" matter in such a case? Any idea what they are
> busy with, without any configuration updates from OpenStack? The nb-db
> is not busy though.
> 
> Probably because sb-db is busy, ovn-controller can't connect to it
> consistently. It keeps being disconnected and reconnecting. Restarting
> ovn-controller seems to help. I am able to launch a few VMs on different
> networks and they are connected via the router.
> 
> Now, I have a problem with external access. The router is set as gateway to
> a provider/underlay network on an interface on the gateway node. The
> router is allocated an underlay address from that provider network. My
> understanding is that the br-ex on the gateway node holding the active
> router will broadcast ARP to announce the router's underlay address in
> case of failover. It will also respond to ARP requests for that router
> underlay address. But when I run tcpdump on that underlay interface on the
> gateway node, I see ARP requests coming in, but no ARP responses going out.
> I checked the flow table in sb-db and it seems OK. I also checked the flows
> on br-ex with "ovs-ofctl dump-flows br-ex", and I don't see anything about
> ARP there.
> How should I look into it?
> 
> Again, the goal is to support 4K networks with external access (security
> groups are disabled). The options include 4K routers (one for each
> network), 50 routers (one for every 80 networks), 1 router (for all 4K
> networks)...
> All networks are isolated by ACL on the logical router. Which option
> should work better?
> Any comment is appreciated.
> 
> 
> Thanks!
> 
> Tony
> 
> 
> ________________________________
> From: discuss <ovs-discuss-boun...@openvswitch.org> on behalf of Tony
> Liu <tonyliu0...@hotmail.com>
> Sent: July 21, 2020 09:09 PM
> To: Daniel Alvarez <dalva...@redhat.com>
> Cc: ovs-discuss@openvswitch.org <ovs-discuss@openvswitch.org>
> Subject: Re: [ovs-discuss] OVN scale
> 
> [root@ovn-db-2 ~]# ovn-nbctl list nb_global
> _uuid               : b7b3aa05-f7ed-4dbc-979f-10445ac325b8
> connections         : []
> external_ids        : {"neutron:liveness_check_at"="2020-07-22
> 04:03:17.726917+00:00"}
> hv_cfg              : 312
> ipsec               : false
> name                : ""
> nb_cfg              : 2636
> options             : {mac_prefix="ca:e8:07",
> svc_monitor_mac="4e:d0:3a:80:d4:b7"}
> sb_cfg              : 2005
> ssl                 : []
> 
> [root@ovn-db-2 ~]# ovn-sbctl list sb_global
> _uuid               : 3720bc1d-b0da-47ce-85ca-96fa8d398489
> connections         : []
> external_ids        : {}
> ipsec               : false
> nb_cfg              : 312
> options             : {mac_prefix="ca:e8:07",
> svc_monitor_mac="4e:d0:3a:80:d4:b7"}
> ssl                 : []
> 
> The NBDB and SBDB are definitely out of sync. Is there any way to force
> ovn-northd to sync them?
> 
> Thanks!
> 
> Tony
> 
> ________________________________
> From: Tony Liu <tonyliu0...@hotmail.com>
> Sent: July 21, 2020 08:39 PM
> To: Daniel Alvarez <dalva...@redhat.com>
> Cc: Cory Hawkless <c...@hawkless.id.au>; ovs-discuss@openvswitch.org
> <ovs-discuss@openvswitch.org>; Dumitru Ceara <dce...@redhat.com>
> Subject: Re: [ovs-discuss] OVN scale
> 
> When creating a network (and subnet) in OpenStack, a GW port and a service
> port (for DHCP and metadata) are also created. They are created in
> Neutron and ovn-nb-db by the ML2 driver. Then ovn-northd translates such
> updates from NBDB to SBDB. My question here is: with 20.03, is this
> translation incremental?
> 
> After successfully creating 4000 networks in OpenStack, I see 4000
> logical switches and 8000 LS ports in NBDB. But in SBDB, there are only
> 1567 port bindings. The break happened when translating the 1568th port. If
> ovn-northd recompiles the whole DB for every update, this problem can be
> explained: the DB is too big for ovn-northd to compile in time, so all
> the following updates are lost. Does that make sense?
> 
> I recall that DB updates are coordinated by some "version": when changes
> happen in NBDB, the version bumps up, then ovn-northd updates SBDB and
> bumps up its version as well, so they match. So, if the NBDB version bumps
> up more than once while ovn-northd is updating SBDB, is that still going to
> work? If yes, then it's just a matter of time: no matter how fast updates
> happen in NBDB, ovn-northd will catch up with them eventually. Am I right
> about that?
> 
> Any comment is welcome.
> 
> 
> Thanks!
> 
> Tony
> 
> 
> ________________________________
> From: Tony Liu <tonyliu0...@hotmail.com>
> Sent: July 21, 2020 10:22 AM
> To: Daniel Alvarez <dalva...@redhat.com>
> Cc: Cory Hawkless <c...@hawkless.id.au>; ovs-discuss@openvswitch.org
> <ovs-discuss@openvswitch.org>; Dumitru Ceara <dce...@redhat.com>
> Subject: Re: [ovs-discuss] OVN scale
> 
> Hi Daniel, all
> 
> 4000 networks and 50 routers, 200 networks on each router, they are all
> created.
> CPU usage of Neutron server, ovn-nb-db, ovn-northd, ovn-sb-db,
> ovn-controller and ovs-vswitchd is OK: not consistently 100%, but with
> some spikes to it.
> 
> Now, when creating a VM, I get a "waiting for vif-plugged-in timeout" error.
> This brings up another question: it used to be neutron-agent notifying
> Neutron server of port status changes; with OVN, who does it?
> How should I look into this?
> 
> Please see my other comments inline...
> 
> 
> Thanks!
> 
> Tony
> ________________________________
> From: Daniel Alvarez <dalva...@redhat.com>
> Sent: July 21, 2020 12:06 AM
> To: Tony Liu <tonyliu0...@hotmail.com>
> Cc: Cory Hawkless <c...@hawkless.id.au>; ovs-discuss@openvswitch.org
> <ovs-discuss@openvswitch.org>; Dumitru Ceara <dce...@redhat.com>
> Subject: Re: [ovs-discuss] OVN scale
> 
> Hi Tony, all
> 
> 
> 
> On 21 Jul 2020, at 07:53, Tony Liu <tonyliu0...@hotmail.com> wrote:
> 
> 
> Hi Cory,
> 
> With 4000 networks all connecting to one router with an external GW, all
> networks and the router are created and connected. I launched a few VMs on
> some networks; they are connected and all have external connectivity.
> When running ping on a VM, there is a slow ping (a few seconds) out of 10+
> normal pings (< 1ms). When checking CPU usage, I see Neutron server, OVN
> DB, OVN controller and ovs-vswitchd all take almost 100% CPU. It's been
> like that for hours already.
> Since they are all created and some of them work fine (I didn't validate
> all networks), I'm not sure what those services are busy with. I checked
> the log: ovn-controller keeps switching between ovn-sb-db servers because
> of heartbeat timeouts.
> 
> 
> How are you deploying OpenStack and in particular the OVN DBs? Is it a
> RAFT cluster?
> 
> > Kolla Ansible. I see cluster-local-address and remote address (to the
> > first node) are specified for all 3 nodes. I assume clustering is
> > enabled.
> > Is there a different type of cluster?
> 
> What’s your current value for ovn-remote-probe-interval? If it’s too low,
> this can be triggering reconnections all the time and creating a
> snowball effect.
> 
> > external_ids        : {ovn-encap-ip="10.6.30.22", ovn-encap-type=geneve,
> > ovn-remote="tcp:10.6.20.84:6642,tcp:10.6.20.85:6642,tcp:10.6.20.86:6642",
> > ovn-remote-probe-interval="60000", system-id="compute-3"}
> 
> You can bump the probe interval timeout like this:
> 
> ovs-vsctl set open . external_ids:ovn-remote-probe-interval=<TIME IN MS>
> 
> 
> I'd like to know if that's expected, or whether there is something I can
> tune to fix the problem. If that's expected, I can't think of anything
> other than building multiple clusters to support that kind of scale.
> 
> I am running a test with 4000 networks and 50 routers, 80 networks on
> each router. I wonder if that's going to help.
> 
> Reducing the number of routers should help. Also, there are some
> improvements in the 20.06 release when it comes to the number of logical
> flows, from a series of patches by Han. I will post the links later,
> sorry.
> 
> Also, there is a big improvement around large Port Groups, as they are now
> split by datapath, dramatically reducing the calculations in
> ovn-controller, especially in scenarios with a large number of networks
> like yours.
> However, you seem to have no security groups and hence no Port Groups in
> the NB database. Is this correct?
> 
> > Yes. For now, I want to avoid scale impact from SG, so I disable it.
> 
> Is there any chance you can rerun the initial scenario but with 20.06?
> 
> > Is there a container for 20.06? Or where can I get the packages for 20.06?
> > I should be able to upgrade 20.03 to 20.06 by upgrading packages.
> 
> The goal is to have thousands of networks with external connectivity. I'd
> like to know what scale the current OVN is expected to support.
> 
> +Dumitru as we know that there is a limit of 3000 on the number of
> resubmissions. So having 3K routers connected to the public logical switch
> may hit this limitation. Please @Dumitru correct me if I’m wrong.
> 
> Any comment is welcome.
> 
> 
> Thanks!
> 
> Tony
> 
> ________________________________
> From: Cory Hawkless <c...@hawkless.id.au>
> Sent: July 20, 2020 10:04 PM
> To: Tony Liu <tonyliu0...@hotmail.com>; ovs-discuss@openvswitch.org
> <ovs-discuss@openvswitch.org>
> Subject: RE: OVN scale
> 
> 
> I would expect to see 100% cpu utilisation on anything involved in the
> process of creating 4000 networks and routers but the question is for
> how long do you see high utilisation? Does it last for seconds, minutes,
> hours?
> 
> Do the resources actually get created after some period of time or is
> the process failing?
> 
> 
> 
> From: discuss [mailto:ovs-discuss-boun...@openvswitch.org] On Behalf Of
> Tony Liu
> Sent: Tuesday, 21 July 2020 1:53 PM
> To: ovs-discuss@openvswitch.org
> Subject: [ovs-discuss] OVN scale
> 
> 
> 
> Hi folks,
> 
> This is my first email here. Please let me know if there is any rule
> or convention I need to follow. Don't want to break it.
> 
> I started with OpenStack Ussuri and OVN 20.03.0 recently and am currently
> running some scaling tests. I searched around for scaling info and noticed
> some improvements already presented, which is pretty cool.
> Is "incremental" by DDlog implemented yet?
> 
> With a 3-node OVN DB cluster and 3 compute nodes (with OVN controller),
> I created 4000 networks from OpenStack and 4000 logical routers with
> external GW, and added one network to each LR. Port security is disabled on
> all networks. Then I see ovn-northd, ovn-controller and ovs-vswitchd all
> take almost 100% CPU. Is this expected?
> 
> I revised the solution and am running a test with 4000 networks, 20 LRs and
> 200 networks on each LR. I will see if this makes any difference.
> 
> Is there any scaling and performance report for the latest OVN release
> that I can use as a reference?
> 
> Thanks!
> 
> Tony
> 
> 
> 
> _______________________________________________
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> _______________________________________________
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
