On Fri, 2022-05-06 at 16:27 -0300, Tiago Pires wrote:
> Hi all,
> 
> I was checking the mailing list history, and this thread about raft
> ovsdb clustering caught my attention:
> https://mail.openvswitch.org/pipermail/ovs-discuss/2018-March/046438.html
> In my setup (OVN 20.03 and OpenStack Ussuri), the ovn-controllers are
> configured with
> ovn-remote="tcp:10.2X.4X.4:6642,tcp:10.2X.4X.68:6642,tcp:10.2X.4X.132:6642",
> i.e. the 3 OVN central members that are in cluster mode.
> Also on the neutron ML2 side:
> [ovn]
> ovn_native_dhcp = True
> ovn_nb_connection = tcp:10.2X.4X.4:6641,tcp:10.2X.4X.68:6641,tcp:10.2X.4X.132:6641
> ovn_sb_connection = tcp:10.2X.4X.4:6642,tcp:10.2X.4X.68:6642,tcp:10.2X.4X.132:6642
> 
> We are experiencing an issue with Neutron when the OVN leader decides
> to take a snapshot and, by design, another member becomes leader (more
> or less every 8 minutes):
> 2022-05-05T16:57:42.135Z|17401|raft|INFO|Transferring leadership to write a snapshot.
> 
> ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound
> 4a03
> Name: OVN_Southbound
> Cluster ID: ca74 (ca744caf-40cd-4751-a2f2-86e35ad6541c)
> Server ID: 4a03 (4a0328dc-e9a4-495e-a4f1-0a0340fc6d19)
> Address: tcp:10.2X.4X.132:6644
> Status: cluster member
> Role: leader
> Term: 1912
> Leader: self
> Vote: self
> 
> Election timer: 10000
> Log: [497643, 498261]
> Entries not yet committed: 0
> Entries not yet applied: 0
> Connections: ->3d6c ->4ef0 <-3d6c <-4ef0
> Servers:
>     4a03 (4a03 at tcp:10.2X.4X.132:6644) (self) next_index=497874 match_index=498260
>     3d6c (3d6c at tcp:10.2X.4X.68:6644) next_index=498261 match_index=498260
>     4ef0 (4ef0 at tcp:10.2X.4X.4:6644) next_index=498261 match_index=498260
> 
> As I understood it, the TCP connections from Neutron (NB) and the
> ovn-controllers (SB) to the OVN central nodes are established only
> with the leader:

ovn-controller should not be connecting only to the leader; it has been
hardcoded not to since late 2017:

    /* Configure OVN SB database. */
    struct ovsdb_idl_loop ovnsb_idl_loop = OVSDB_IDL_LOOP_INITIALIZER(
        ovsdb_idl_create_unconnected(&sbrec_idl_class, true));
    ovsdb_idl_set_leader_only(ovnsb_idl_loop.idl, false);

That's by design, since otherwise half the point of RAFT (the
load-balancing part) would be lost. ovn-controller may not receive
updates while the DB server it is connected to writes a snapshot, but
things will catch up right after.
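
For reference, the Python bindings expose the same knob. Here's a
minimal sketch (not anything Neutron ships; the remotes and schema path
are placeholders, and I'm assuming the leader_only attribute present in
recent python-ovs):

    import ovs.db.idl

    SB_REMOTES = "tcp:10.0.0.1:6642,tcp:10.0.0.2:6642,tcp:10.0.0.3:6642"

    helper = ovs.db.idl.SchemaHelper("/usr/share/ovn/ovn-sb.ovsschema")
    helper.register_all()

    idl = ovs.db.idl.Idl(SB_REMOTES, helper)
    # Read-mostly clients (like ovn-controller) can talk to any cluster
    # member; write-heavy clients such as Neutron normally keep
    # leader_only = True so transactions go straight to the leader.
    idl.leader_only = False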

> 
> #OVN central leader
> $ netstat -nap | grep 6642| more
> 
> tcp        0      0 0.0.0.0:6642            0.0.0.0:*               LISTEN      -
> tcp        0      0 10.2X.4X.132:6642       10.24.40.17:47278       ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.24.40.76:36240       ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.2X.4X.17:47280       ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.2X.4X.6:43102        ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.2X.4X.75:58890       ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.2X.4X.6:43108        ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.2X.4X.17:47142       ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.2X.4X.71:48808       ESTABLISHED -
> tcp        0      0 10.2X.4X.132:6642       10.2X.4X.17:47096       ESTABLISHED -
> #OVN follower 2
> 
> $ netstat -nap | grep 6642
> 
> tcp        0      0 0.0.0.0:6642            0.0.0.0:*               LISTEN      -
> tcp        0      0 10.2X.4X.4:6642         10.2X.4X.76:57256       ESTABLISHED -
> tcp        0      0 10.2X.4X.4:6642         10.2X.4X.134:54026      ESTABLISHED -
> tcp        0      0 10.2X.4X.4:6642         10.2X.4X.10:34962       ESTABLISHED -
> tcp        0      0 10.2X.4X.4:6642         10.2X.4X.6:49238        ESTABLISHED -
> tcp        0      0 10.2X.4X.4:6642         10.2X.4X.135:59972      ESTABLISHED -
> tcp        0      0 10.2X.4X.4:6642         10.2X.4X.75:40162       ESTABLISHED -
> tcp        0      0 10.2X.4X.4:39566        10.2X.4X.132:6642       ESTABLISHED -
> #OVN follower 3
> 
> netstat -nap | grep 6642 
> 
> tcp        0      0 0.0.0.0:6642            0.0.0.0:*               LISTEN      -
> tcp        0      0 10.2X.4X.68:6642        10.2X.4X.70:40750       ESTABLISHED -
> tcp        0      0 10.2X.4X.68:6642        10.2X.4X.11:49718       ESTABLISHED -
> tcp        0      0 10.2X.4X.68:45632       10.2X.4X.132:6642       ESTABLISHED -
> tcp        0      0 10.2X.4X.68:6642        10.2X.4X.16:44816       ESTABLISHED -
> tcp        0      0 10.2X.4X.68:6642        10.2X.4X.7:45216        ESTABLISHED
> 
> The issue we are experiencing is that neutron-server disconnects when
> the OVN leader changes (due to the snapshot, roughly every 8 minutes)
> and reconnects to the next leader. This breaks the OpenStack API when
> someone is trying to create a VM at the same time.
> First, is my current configuration correct? Should the leader change
> break the Neutron side, or is there some missing configuration?

I don't know anything specific about Neutron's handling of leader
changes, but that's likely an issue with the Python OVN bindings. There
has been a lot of work in the Python bindings recently for RAFT (e.g.
"python: idl: Add monitor_cond_since support.", which is especially
important for performance and scale).

Typically the client (Neutron, python bindings) notices the leadership
change (by watching the _Server database of its connected dbserver),
stops outstanding requests, connects to the new leader, gets the latest
database changes with update3/monitor_cond_since to bring its cache up
to date, and then retries the outstanding requests after checking the
cache to make sure they are still required.
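
To illustrate the pattern, here is a rough sketch with the raw
python-ovs IDL (this is not Neutron's actual code, which goes through
ovsdbapp; the remotes, schema path, and the create_switch() helper are
made up for the example). The IDL takes care of the reconnect and the
monitor_cond_since resync; the client only re-checks its cache and
retries transactions that come back as TRY_AGAIN:

    import ovs.db.idl
    import ovs.poller

    helper = ovs.db.idl.SchemaHelper("/usr/share/ovn/ovn-nb.ovsschema")
    helper.register_all()
    remotes = "tcp:10.0.0.1:6641,tcp:10.0.0.2:6641,tcp:10.0.0.3:6641"
    idl = ovs.db.idl.Idl(remotes, helper)  # leader_only defaults to True
                                           # in recent python-ovs

    def run_idl(idl):
        """Give the IDL a chance to reconnect and resync its cache."""
        idl.run()
        poller = ovs.poller.Poller()
        idl.wait(poller)
        poller.block()

    def create_switch(idl, name):
        """Insert a Logical_Switch, retrying across a leader change."""
        while True:
            # Check the (freshly resynced) cache first: the row may
            # already exist, so a retry is no longer required.
            if any(ls.name == name
                   for ls in idl.tables["Logical_Switch"].rows.values()):
                return
            txn = ovs.db.idl.Transaction(idl)
            row = txn.insert(idl.tables["Logical_Switch"])
            row.name = name
            status = txn.commit_block()  # runs the IDL until it finishes
            if status == ovs.db.idl.Transaction.SUCCESS:
                return
            if status != ovs.db.idl.Transaction.TRY_AGAIN:
                raise RuntimeError("OVSDB transaction failed: %s" % status)
            # TRY_AGAIN typically means the connection moved (e.g. to a
            # new leader) mid-transaction; let the IDL catch up, retry.
            run_idl(idl)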

There will likely be a short window (< 1s) during which operations are
queued while the bindings handle the reconnection, but no operations
should fail if the CMS and the bindings handle the transition correctly.

Note that the OVS version is important here too. Ilya did a *ton* of
perf/scale enhancements to ovsdb-server in 2.16 and 2.17 that reduce
ovsdb-server memory usage, CPU usage, and snapshot time. The Python
IDL/bindings have gotten a lot of love in the past year or two as well.

> I was wondering if it is possible to use an LB with a VIP that
> balances the connections across the OVN central members; I would then
> reconfigure the Neutron side, and also the ovn-controllers, to use
> only the VIP. Does that make sense?

IMO that would attempt to work around a problem that likely shouldn't
be happening.

Dan

> 
> Thank you.
> 
> Regards,
> 
> Tiago Pires
> 

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
