Re: [ovs-discuss] [ovs-dev] [OVN] ovn-northd HA

2020-08-01 Thread Han Zhou
Hi Tony,

Please find my answers inline.

On Sat, Aug 1, 2020 at 5:55 PM Tony Liu  wrote:

> When I restore 4096 LS, 4354 LSP, 256 LR and 256 LRP (I clean up
> all DBs before the restore), it takes a few seconds to restore the nb-db,
> but ovn-northd takes forever to update the sb-db.
>
> I changed the sb-db election timer from 1s to 10s. Then it takes just a
> few minutes for the sb-db to get fully synced.
>
> How does sb-db leader switching affect this sync?
>
Most likely the SB DB was busy and the RAFT election timed out, so it kept
re-electing a leader (this can be confirmed by checking whether the "term"
number keeps increasing) and thus never got synced. When you changed the
timer to 10s, it could complete the work without leader flapping.
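
If it helps, here is roughly how to check and adjust this (a sketch: the
ctl socket path varies by packaging, so adjust it to your install):

  # watch the RAFT term; if "Term" keeps climbing during the sync, the
  # cluster is flapping between leaders
  ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound

  # raise the election timer (run on the leader); ovsdb-server accepts at
  # most a doubling per step, so 1000 ms -> 10000 ms takes a few steps
  ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/change-election-timer OVN_Southbound 2000
  ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/change-election-timer OVN_Southbound 4000
  ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/change-election-timer OVN_Southbound 8000
  ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/change-election-timer OVN_Southbound 10000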


>
> Thanks!
>
> Tony
>
> > -----Original Message-----
> > From: dev  On Behalf Of Tony Liu
> > Sent: Saturday, August 1, 2020 5:26 PM
> > To: ovs-discuss ; ovs-dev <d...@openvswitch.org>
> > Subject: [ovs-dev] [OVN] ovn-northd HA
> >
> > Hi,
> >
> > I have a few questions about ovn-northd HA.
> >
> > Does the lock for active ovn-northd have to be acquired from the leader
> > of sb-db?
>

Yes, because ovn-northd sets "leader_only" to true for the connection. As I
recall, it is also required that all lock participants connect to the leader
for the OVSDB lock to work properly.
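
If you want to observe the lock behavior from the outside, newer
ovsdb-client builds have a "lock" command (a sketch; I'm assuming the lock
name "ovn_northd" and reusing the SB address from your logs):

  # ask the SB server for the same lock ovn-northd uses; the client
  # reports when the lock is granted and when it is later stolen or lost
  ovsdb-client lock tcp:10.6.20.85:6642 ovn_northd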


> >
> > If ovn-northd doesn't acquire the lock, it becomes standby. Does it keep
> > trying to acquire the lock, wait for a notification, or monitor the
> > active ovn-northd?
>

It is based on OVSDB notifications: the standby does not poll, it waits for
the server to notify it when the lock becomes available.
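
Concretely, this is the OVSDB lock protocol from RFC 7047 (assuming the
lock name "ovn_northd"): the standby's lock request is answered with
"locked": false, and it then simply waits for the server's notification:

  client -> server:  {"method": "lock", "params": ["ovn_northd"], "id": 0}
  server -> client:  {"id": 0, "result": {"locked": false}, "error": null}
  ... later, when the current holder releases the lock or disconnects ...
  server -> client:  {"method": "locked", "params": ["ovn_northd"], "id": null}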

>
> > If it keeps trying, what's the period?
> >
> > Say the active ovn-northd is down, so its connection to the sb-db is down;
> > the sb-db releases the lock, so another ovn-northd can acquire it.
> > Is that correct?
> >
>
Yes.

> > When the sb-db is busy, the connection from ovn-northd is dropped (not sure
> > from which side), and that triggers an active ovn-northd switch.
> > Is that right?
> >
>
It is possible, but the same northd may get the lock again if it is lucky.


> > In case the sb-db leader switches, is that going to cause an active ovn-
> > northd switch as well?
> >
>
It is possible, but the same northd may get the lock again if it is lucky.


> > For whatever reason, if the active ovn-northd switches, is the new
> > active ovn-northd going to continue the work left by the previous leader,
> > or start all over again?
> >
>

Even for the same ovn-northd, it always recomputes everything in response
to any change. So during a switchover, the new active ovn-northd doesn't
need to "continue" - it just recomputes everything as usual.
Once incremental processing is implemented, the new active ovn-northd may
need to do a full recompute first and then handle further changes
incrementally.
In either case, there is no need to "continue the work left by the
previous leader".

Thanks,
Han

>
> > Thanks!
> >
> > Tony
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [OVN] ovn-northd takes much CPU when no configuration update

2020-08-01 Thread Han Zhou
On Fri, Jul 31, 2020 at 4:14 PM Tony Liu  wrote:

> Hi,
>
> I see the active ovn-northd using a lot of CPU (30%-100%) when there are
> no configuration changes from OpenStack and nothing happening on any of
> the chassis nodes either.
>
> Is this expected? What is it busy with?
>
>
Yes, this is expected. It is due to the OVSDB probes between ovn-northd and
the NB/SB OVSDB servers, which are used to detect OVSDB connection failures.
Usually this is not a concern (unlike the probes for a large number of
ovn-controller clients), because ovn-northd is a centralized component and
its CPU cost when there is no configuration change doesn't matter that
much. However, if it is a concern, the probe interval (default 5 sec) can
be changed.
If you change it, remember to change it on both the server side and the
client side.
For the client side (ovn-northd), it is configured via
options:northd_probe_interval in the NB DB's NB_Global table. See the
ovn-nb(5) man page.
For the server side (NB and SB), it is configured in the inactivity_probe
column of each DB's Connection table.
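
For example (a sketch; values are in milliseconds, and the Connection
commands assume the listeners were created via the Connection table, e.g.
with ovn-nbctl/ovn-sbctl set-connection):

  # client side: make ovn-northd probe every 30 s instead of 5 s
  ovn-nbctl set NB_Global . options:northd_probe_interval=30000

  # server side: bump the inactivity probe on the NB and SB listeners
  ovn-nbctl set Connection . inactivity_probe=30000
  ovn-sbctl set Connection . inactivity_probe=30000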

Thanks,
Han


> 
> 2020-07-31T23:08:09.511Z|04267|poll_loop|DBG|wakeup due to [POLLIN] on fd
> 8 (10.6.20.84:44358<->10.6.20.84:6641) at lib/stream-fd.c:157 (68% CPU
> usage)
> 2020-07-31T23:08:09.512Z|04268|jsonrpc|DBG|tcp:10.6.20.84:6641: received
> request, method="echo", params=[], id="echo"
> 2020-07-31T23:08:09.512Z|04269|jsonrpc|DBG|tcp:10.6.20.84:6641: send
> reply, result=[], id="echo"
> 2020-07-31T23:08:12.777Z|04270|poll_loop|DBG|wakeup due to [POLLIN] on fd
> 9 (10.6.20.84:49158<->10.6.20.85:6642) at lib/stream-fd.c:157 (34% CPU
> usage)
> 2020-07-31T23:08:12.777Z|04271|reconnect|DBG|tcp:10.6.20.85:6642: idle
> 5002 ms, sending inactivity probe
> 2020-07-31T23:08:12.777Z|04272|reconnect|DBG|tcp:10.6.20.85:6642:
> entering IDLE
> 2020-07-31T23:08:12.777Z|04273|jsonrpc|DBG|tcp:10.6.20.85:6642: send
> request, method="echo", params=[], id="echo"
> 2020-07-31T23:08:12.777Z|04274|jsonrpc|DBG|tcp:10.6.20.85:6642: received
> request, method="echo", params=[], id="echo"
> 2020-07-31T23:08:12.777Z|04275|reconnect|DBG|tcp:10.6.20.85:6642:
> entering ACTIVE
> 2020-07-31T23:08:12.777Z|04276|jsonrpc|DBG|tcp:10.6.20.85:6642: send
> reply, result=[], id="echo"
> 2020-07-31T23:08:13.635Z|04277|poll_loop|DBG|wakeup due to [POLLIN] on fd
> 9 (10.6.20.84:49158<->10.6.20.85:6642) at lib/stream-fd.c:157 (34% CPU
> usage)
> 2020-07-31T23:08:13.635Z|04278|jsonrpc|DBG|tcp:10.6.20.85:6642: received
> reply, result=[], id="echo"
> 2020-07-31T23:08:14.480Z|04279|hmap|DBG|Dropped 129 log messages in last 5
> seconds (most recently, 0 seconds ago) due to excessive rate
> 2020-07-31T23:08:14.480Z|04280|hmap|DBG|lib/shash.c:112: 2 buckets with 6+
> nodes, including 2 buckets with 6 nodes (32 nodes total across 32 buckets)
> 2020-07-31T23:08:14.513Z|04281|poll_loop|DBG|wakeup due to 27-ms timeout
> at lib/reconnect.c:643 (34% CPU usage)
> 2020-07-31T23:08:14.513Z|04282|reconnect|DBG|tcp:10.6.20.84:6641: idle
> 5001 ms, sending inactivity probe
> 2020-07-31T23:08:14.513Z|04283|reconnect|DBG|tcp:10.6.20.84:6641:
> entering IDLE
> 2020-07-31T23:08:14.513Z|04284|jsonrpc|DBG|tcp:10.6.20.84:6641: send
> request, method="echo", params=[], id="echo"
> 2020-07-31T23:08:15.370Z|04285|poll_loop|DBG|wakeup due to [POLLIN] on fd
> 8 (10.6.20.84:44358<->10.6.20.84:6641) at lib/stream-fd.c:157 (34% CPU
> usage)
> 2020-07-31T23:08:15.370Z|04286|jsonrpc|DBG|tcp:10.6.20.84:6641: received
> request, method="echo", params=[], id="echo"
> 2020-07-31T23:08:15.370Z|04287|reconnect|DBG|tcp:10.6.20.84:6641:
> entering ACTIVE
> 2020-07-31T23:08:15.370Z|04288|jsonrpc|DBG|tcp:10.6.20.84:6641: send
> reply, result=[], id="echo"
> 2020-07-31T23:08:16.236Z|04289|poll_loop|DBG|wakeup due to 0-ms timeout at
> tcp:10.6.20.84:6641 (100% CPU usage)
> 2020-07-31T23:08:16.236Z|04290|jsonrpc|DBG|tcp:10.6.20.84:6641: received
> reply, result=[], id="echo"
> 2020-07-31T23:08:17.778Z|04291|poll_loop|DBG|wakeup due to [POLLIN] on fd
> 9 (10.6.20.84:49158<->10.6.20.85:6642) at lib/stream-fd.c:157 (100% CPU
> usage)
> 2020-07-31T23:08:17.778Z|04292|jsonrpc|DBG|tcp:10.6.20.85:6642: received
> request, method="echo", params=[], id="echo"
> 2020-07-31T23:08:17.778Z|04293|jsonrpc|DBG|tcp:10.6.20.85:6642: send
> reply, result=[], id="echo"
> 2020-07-31T23:08:20.372Z|04294|poll_loop|DBG|wakeup due to [POLLIN] on fd
> 8 (10.6.20.84:44358<->10.6.20.84:6641) at lib/stream-fd.c:157 (41% CPU
> usage)
> 2020-07-31T23:08:20.372Z|04295|reconnect|DBG|tcp:10.6.20.84:6641: idle
> 5002 ms, sending inactivity probe
> 2020-07-31T23:08:20.372Z|04296|reconnect|DBG|tcp:10.6.20.84:6641:
> entering IDLE
> 2020-07-31T23:08:20.372Z|04297|jsonrpc|DBG|tcp:10.6.20.84:6641: send
> request, method="echo", params=[], id="echo"
> 2020-07-31T23:08:20.372Z|04298|jsonrpc|DBG|tcp:10.6.20.84:6641: received
> request, method="echo", params=[], id="echo"
> 2020-07-31T23:08:20.372Z|04299|reconnect|DBG|tcp:10.6.20.84:6641:
> entering ACTIVE
> 

Re: [ovs-discuss] [OVN] ovn-northd HA

2020-08-01 Thread Tony Liu
When I restore 4096 LS, 4354 LSP, 256 LR and 256 LRP (I clean up
all DBs before the restore), it takes a few seconds to restore the nb-db,
but ovn-northd takes forever to update the sb-db.

I changed the sb-db election timer from 1s to 10s. Then it takes just a
few minutes for the sb-db to get fully synced.

How does sb-db leader switching affect this sync?


Thanks!

Tony

> -----Original Message-----
> From: dev  On Behalf Of Tony Liu
> Sent: Saturday, August 1, 2020 5:26 PM
> To: ovs-discuss ; ovs-dev <d...@openvswitch.org>
> Subject: [ovs-dev] [OVN] ovn-northd HA
> 
> Hi,
> 
> I have a few questions about ovn-northd HA.
> 
> Does the lock for active ovn-northd have to be acquired from the leader
> of sb-db?
> 
> If ovn-northd doesn't acquire the lock, it becomes standby. Does it keep
> trying to acquire the lock, wait for a notification, or monitor the
> active ovn-northd?
> 
> If it keeps trying, what's the period?
> 
> Say the active ovn-northd is down, so its connection to the sb-db is down;
> the sb-db releases the lock, so another ovn-northd can acquire it.
> Is that correct?
> 
> When the sb-db is busy, the connection from ovn-northd is dropped (not sure
> from which side), and that triggers an active ovn-northd switch.
> Is that right?
> 
> In case the sb-db leader switches, is that going to cause an active
> ovn-northd switch as well?
> 
> For whatever reason, if the active ovn-northd switches, is the new
> active ovn-northd going to continue the work left by the previous leader,
> or start all over again?
> 
> 
> Thanks!
> 
> Tony
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


[ovs-discuss] [OVN] ovn-northd HA

2020-08-01 Thread Tony Liu
Hi,

I have a few questions about ovn-northd HA.

Does the lock for active ovn-northd have to be acquired from the leader
of sb-db?

If ovn-northd doesn't acquire the lock, it becomes standby. Does it keep
trying to acquire the lock, wait for a notification, or monitor the
active ovn-northd?

If it keeps trying, what's the period?

Say the active ovn-northd is down, so its connection to the sb-db is down;
the sb-db releases the lock, so another ovn-northd can acquire it.
Is that correct?

When the sb-db is busy, the connection from ovn-northd is dropped (not sure
from which side), and that triggers an active ovn-northd switch.
Is that right?

In case the sb-db leader switches, is that going to cause an active
ovn-northd switch as well?

For whatever reason, if the active ovn-northd switches, is the new
active ovn-northd going to continue the work left by the previous leader,
or start all over again?


Thanks!

Tony

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


[ovs-discuss] connecting Mininet Topology to Internet

2020-08-01 Thread Luca Mancini
Hello,
I’m trying to connect a mininet topology to the internet so that I can test out 
some actions I created.
I created a simple topology:

h1, h2, h3, h4 --- s1 --- s2 --- Internet

I did this by adding eth1 (which is what the VM uses to connect to the
internet) to s2 with the ovs-vsctl add-port command, and then running
dhclient on each host's hX-eth0 interface, roughly as sketched below.
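
  # a rough sketch of what I ran (interface/host names are from my
  # topology; I may be misremembering details):
  ovs-vsctl add-port s2 eth1     # attach the VM's uplink interface to s2
  ifconfig eth1 0                # clear eth1's IP now that it's a bridge port
  # then, from the mininet CLI, on each host:
  mininet> h1 dhclient h1-eth0
  mininet> h2 dhclient h2-eth0
  (and so on for h3, h4)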
Something weird happens: every host can reach the internet (I tried both
ping and the links command), but when I look at the flows that ovs creates
automatically, it seems like only one of the hosts (the first one I pinged
from) is receiving replies from the internet, even though no packets are
dropped on the other hosts. How is this possible?
Does anyone have a better way to connect a mininet topology to the internet?

Any help is much appreciated, since I've been banging my head against this
issue for a couple of days now.
Luca

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss