On Tue, Aug 4, 2020 at 9:12 AM Tony Liu <tonyliu0...@hotmail.com> wrote:
> In my deployment, on each Neutron server, there are 13 Neutron server
> processes.
> I see 12 of them (monitor, maintenance, RPC, API) connect to both ovn-nb-db
> and ovn-sb-db. With 3 Neutron server nodes, that's 36 OVSDB clients.
> Is so many clients OK?
>
> Any suggestions how to figure out which side doesn't respond the probe,
> if it's bi-directional? I don't see any activities from logging, other than
> connect/drop and reconnect...
>
> BTW, please let me know if this is not the right place to discuss Neutron
> OVN ML2 driver.
>
>
> Thanks!
>
> Tony
>
> > -----Original Message-----
> > From: dev <ovs-dev-boun...@openvswitch.org> On Behalf Of Tony Liu
> > Sent: Monday, August 3, 2020 7:45 PM
> > To: ovs-discuss <ovs-discuss@openvswitch.org>; ovs-dev <ovs-
> > d...@openvswitch.org>
> > Subject: [ovs-dev] [OVN] no response to inactivity probe
> >
> > Hi,
> >
> > Neutron OVN ML2 driver was disconnected by ovn-nb-db. There are many
> > error messages from ovn-nb-db leader.
> > ========
> > 2020-08-04T02:31:39.751Z|03138|reconnect|ERR|tcp:10.6.20.81:58620: no response to inactivity probe after 5 seconds, disconnecting
> > 2020-08-04T02:31:42.484Z|03139|reconnect|ERR|tcp:10.6.20.81:58300: no response to inactivity probe after 5 seconds, disconnecting
> > 2020-08-04T02:31:49.858Z|03140|reconnect|ERR|tcp:10.6.20.81:59582: no response to inactivity probe after 5 seconds, disconnecting
> > 2020-08-04T02:31:53.057Z|03141|reconnect|ERR|tcp:10.6.20.83:42626: no response to inactivity probe after 5 seconds, disconnecting
> > 2020-08-04T02:31:53.058Z|03142|reconnect|ERR|tcp:10.6.20.82:45412: no response to inactivity probe after 5 seconds, disconnecting
> > 2020-08-04T02:31:54.067Z|03143|reconnect|ERR|tcp:10.6.20.81:59416: no response to inactivity probe after 5 seconds, disconnecting
> > 2020-08-04T02:31:54.809Z|03144|reconnect|ERR|tcp:10.6.20.81:60004: no response to inactivity probe after 5 seconds, disconnecting
> > ========
> >
> > Could anyone share a bit details how this inactivity probe works?
>

The inactivity probe is sent by both the server and the clients,
independently. That is, ovsdb-server sends an inactivity probe every 'x'
configured seconds to each of its connected clients, and if it doesn't get
a reply from a client within some time, it drops that connection.

The probe interval on the server side can be configured. Run
"ovn-nbctl list connection" and you will see the inactivity_probe column.
You can set this column to the desired value, e.g. for 30 seconds:

    ovn-nbctl set connection . inactivity_probe=30000

The same applies to the SB ovsdb-server.

Similarly, each client (ovn-northd, ovn-controller, neutron server) sends
an inactivity probe every 'y' seconds, and if the client doesn't get any
reply from ovsdb-server, it drops the connection and reconnects.

For ovn-northd you can configure this as:

    ovn-nbctl set NB_Global . options:northd_probe_interval=30000

For ovn-controllers:

    ovs-vsctl set open . external_ids:ovn-remote-probe-interval=30000

There is also a probe interval for the OpenFlow connection from
ovn-controller to ovs-vswitchd, which you can configure as follows (this
one is in seconds):

    ovs-vsctl set open . external_ids:ovn-openflow-probe-interval=30

Regarding the neutron server, I think it is set to 60 seconds. Please see:
https://github.com/openstack/neutron/blob/master/neutron/conf/plugins/ml2/drivers/ovn/ovn_conf.py#L80

From the logs you shared, it looks like ovsdb-server is not getting the
probe reply from the neutron server within 5 seconds, and hence it is
disconnecting. Not sure what's happening though. You can try increasing
the inactivity probe interval on the ovsdb-server side with the first
command I shared.
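
To make the timing above concrete, here is a rough Python sketch of how one endpoint's probe timer could behave. It is only a toy model built on my description, not the actual OVS code: the names ProbeTimer, State, tick and on_receive are made up for illustration, and it assumes the endpoint waits one more full interval for a reply after sending a probe. Each side of the connection would run its own copy of this with its own interval.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    ACTIVE = "active"              # traffic was seen within the interval
    IDLE = "idle"                  # probe sent, waiting for any reply
    DISCONNECTED = "disconnected"  # peer never answered the probe


@dataclass
class ProbeTimer:
    """One endpoint's view of a connection (hypothetical model, not OVS code)."""

    interval_ms: int          # e.g. 5000 for the 5 seconds seen in the logs
    state: State = State.ACTIVE
    last_event_ms: int = 0    # time of the last received message or sent probe

    def on_receive(self, now_ms: int) -> None:
        """Any message from the peer, including a probe reply, resets the timer."""
        if self.state is not State.DISCONNECTED:
            self.state = State.ACTIVE
            self.last_event_ms = now_ms

    def tick(self, now_ms: int) -> str:
        """Check the timer at now_ms and return the action taken."""
        # Assumption: an interval of 0 means probing is switched off.
        if self.interval_ms <= 0 or self.state is State.DISCONNECTED:
            return "none"
        if now_ms - self.last_event_ms < self.interval_ms:
            return "none"
        if self.state is State.ACTIVE:
            # Quiet for a full interval: send a probe and start waiting.
            self.state = State.IDLE
            self.last_event_ms = now_ms
            return "send_probe"
        # Already waiting on a probe and still nothing: give up.
        self.state = State.DISCONNECTED
        return "disconnect"


if __name__ == "__main__":
    # Server side with a 5 s interval, talking to a client that never answers.
    server = ProbeTimer(interval_ms=5000)
    for now in (4999, 5000, 9999, 10000):
        print(now, server.tick(now))
```

In this model a client that is blocked for longer than the server's interval gets disconnected even when both processes look idle from the CPU point of view, which is one way the log messages above could come about.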

Note: If "ovn-nbctl list connection" returns empty, you need to create a
connection row first, like:

    ovn-nbctl set-connection ptcp:6641:<IP>

Thanks
Numan

> > From OVN ML2 driver log, I see it connected to the leader, then the
> > connection was closed by leader after 5 or 6 seconds. Is this probe
> > one-way or two-ways?
> > Both sides are not busy, not taking much CPU cycles. Not sure how this
> > could happen. Any thoughts?
> >
> >
> > Thanks!
> >
> > Tony
> >
> > _______________________________________________
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> _______________________________________________
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss