Thank you for your reply.

This is OpenStack with the ML2 plugin. No third-party applications are used
with our network, so no OVN or anything of the sort. To give a quick idea of
the topology: the VMs on our compute nodes go through GRE tunnels to the
network nodes, where they are routed in network namespaces toward a flat
external network.

> Generally, the above indicates that a daemon fronting an Open vSwitch database
> hasn't been able to connect to its client. Usually happens when CPU
> consumption is very high.

The CPUs on our network nodes are practically idle. Is Open vSwitch
single-threaded or multi-threaded, though? If OVS saturated a single thread,
I may have missed it.
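
To line the disconnects up against per-thread CPU graphs, a minimal Python
sketch like the one below could count the inactivity-probe timeouts per minute.
It is only an illustration, not something shipped with Open vSwitch. It assumes
each log entry sits on a single line in the log file, in the
timestamp|sequence|module|level|message layout seen in the quoted errors. The
name count_probe_timeouts is hypothetical.

```python
import re
from collections import Counter

# Assumed layout of one log entry: timestamp|sequence|module|level|message
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}):\d{2}\.\d{3}Z"
    r"\|\d+\|(?P<module>[^|]+)\|(?P<level>[^|]+)\|(?P<msg>.*)$"
)


def count_probe_timeouts(lines):
    """Count 'no response to inactivity probe' entries per minute."""
    per_minute = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and "no response to inactivity probe" in match.group("msg"):
            # The 'ts' group is truncated to the minute, so it works as a bucket key.
            per_minute[match.group("ts")] += 1
    return per_minute


# Three entries taken from the quoted log, joined back onto single lines.
sample = [
    "2018-09-26T04:15:20.676Z|00005|reconnect|ERR|tcp:127.0.0.1:45928: "
    "no response to inactivity probe after 5 seconds, disconnecting",
    "2018-09-26T04:15:20.677Z|00006|reconnect|ERR|tcp:127.0.0.1:45930: "
    "no response to inactivity probe after 5 seconds, disconnecting",
    "2018-09-26T04:29:56.161Z|00024|jsonrpc|WARN|tcp:127.0.0.1:46018: "
    "receive error: Connection reset by peer",
]

for minute, count in sorted(count_probe_timeouts(sample).items()):
    print(minute, count)
```

Minutes with a burst of timeouts are the ones worth comparing against
per-thread CPU usage on the network node.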

Jean-Philippe Méthot
Openstack system administrator
Administrateur système Openstack
PlanetHoster inc.




> On 27 Sept. 2018 at 14:04, Guru Shetty <[email protected]> wrote:
> 
> 
> 
> On Wed, 26 Sep 2018 at 12:59, Jean-Philippe Méthot via discuss
> <[email protected]> wrote:
> Hi,
> 
> I’ve been using Open vSwitch as my networking backend on OpenStack for
> several years now. Lately, as our network has grown, we’ve started noticing
> intermittent packet drops accompanied by the following error messages in
> Open vSwitch:
> 
> 2018-09-26T04:15:20.676Z|00005|reconnect|ERR|tcp:127.0.0.1:45928: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:15:20.677Z|00006|reconnect|ERR|tcp:127.0.0.1:45930: no response to inactivity probe after 5 seconds, disconnecting
> 
> Open vSwitch is a project with multiple daemons. Since you are using
> OpenStack, it is not clear from your message what type of networking plugin
> you are using. Do you use OVN?
> Also, you did not mention which file the above errors came from.
> 
> Generally, the above indicates that a daemon fronting an Open vSwitch database
> hasn't been able to connect to its client. Usually happens when CPU
> consumption is very high.
> 
>  
> 2018-09-26T04:15:30.409Z|00007|reconnect|ERR|tcp:127.0.0.1:45874: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:15:33.661Z|00008|reconnect|ERR|tcp:127.0.0.1:45934: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:15:33.847Z|00009|reconnect|ERR|tcp:127.0.0.1:45894: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:16:03.247Z|00010|reconnect|ERR|tcp:127.0.0.1:45958: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:16:21.534Z|00011|reconnect|ERR|tcp:127.0.0.1:45956: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:16:21.786Z|00012|reconnect|ERR|tcp:127.0.0.1:45974: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:16:47.085Z|00013|reconnect|ERR|tcp:127.0.0.1:45988: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:16:49.618Z|00014|reconnect|ERR|tcp:127.0.0.1:45982: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:16:53.321Z|00015|reconnect|ERR|tcp:127.0.0.1:45964: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:17:15.543Z|00016|reconnect|ERR|tcp:127.0.0.1:45986: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:17:24.767Z|00017|reconnect|ERR|tcp:127.0.0.1:45990: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:17:31.735Z|00018|reconnect|ERR|tcp:127.0.0.1:45998: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:20:12.593Z|00019|reconnect|ERR|tcp:127.0.0.1:46014: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:23:51.996Z|00020|reconnect|ERR|tcp:127.0.0.1:46028: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:25:12.187Z|00021|reconnect|ERR|tcp:127.0.0.1:46022: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:25:28.871Z|00022|reconnect|ERR|tcp:127.0.0.1:46056: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:27:11.663Z|00023|reconnect|ERR|tcp:127.0.0.1:46046: no response to inactivity probe after 5 seconds, disconnecting
> 2018-09-26T04:29:56.161Z|00024|jsonrpc|WARN|tcp:127.0.0.1:46018: receive error: Connection reset by peer
> 2018-09-26T04:29:56.161Z|00025|reconnect|WARN|tcp:127.0.0.1:46018: connection dropped (Connection reset by peer)
> 
> This definitely kills the connection for a few seconds before it reconnects. 
> So, I’ve been wondering, what is this probe and what is really happening 
> here? What’s the cause and is there a way to fix this? 
> 
> The Open vSwitch version is 2.9.0-3 on CentOS 7, running OpenStack Pike
> (the issue also shows up on Queens).
> 
>  
> Jean-Philippe Méthot
> Openstack system administrator
> Administrateur système Openstack
> PlanetHoster inc.
> 
> 
> 
> 
> _______________________________________________
> discuss mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
