On Thu, Apr 20, 2023 at 10:14:14AM +0200, Stefan Hoffmann wrote:
> In some cases ovsdb server or relay gets restarted, ovsdb python clients
> may keep the local socket open. Instead of reconnecting a lot of failures
> will be logged.
> This can be reproduced with ssl connections to the server/relay and
> restarting it, so it has the same IP after restart.
>
> This patch catches the Exceptions at do_handshake to recreate the
> connection on the client side.
>
> Tracebacks from the issue:
>
> Traceback (most recent call last):
>   File
> "/usr/local/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/connection.py",
> line 107, in run
>     self.idl.run()
>   File
> "/usr/local/lib/python3.9/site-packages/ovs-3.1.0-py3.9.egg/ovs/db/idl.py",
> line 433, in run
>     self._session.run()
>   File
> "/usr/local/lib/python3.9/site-packages/ovs-3.1.0-py3.9.egg/ovs/jsonrpc.py",
> line 519, in run
>     error = self.stream.connect()
>   File
> "/usr/local/lib/python3.9/site-packages/ovs-3.1.0-py3.9.egg/ovs/stream.py",
> line 824, in connect
>     self.socket.do_handshake()
>   File "/usr/local/lib/python3.9/site-packages/eventlet/green/ssl.py", line
> 312, in do_handshake
>     return self._call_trampolining(
>   File "/usr/local/lib/python3.9/site-packages/eventlet/green/ssl.py", line
> 158, in _call_trampolining
>     return func(*a, **kw)
>   File "/usr/local/lib/python3.9/ssl.py", line 1310, in do_handshake
>     self._sslobj.do_handshake()
> ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:1129)
>
> 2023-04-03 14:06:43.458 1 ERROR ovsdbapp.backend.ovs_idl.connection
> 2023-04-03 14:06:43.513 1 ERROR ovsdbapp.backend.ovs_idl.connection [-]
> TLS/SSL connection has been closed (EOF) (_ssl.c:997):
> ssl.SSLZeroReturnError: TLS/SSL connection has been closed (EOF) (_ssl.c:997)
> 2023-04-03 14:06:43.513 1 ERROR ovsdbapp.backend.ovs_idl.connection Traceback
> (most recent call last):
> 2023-04-03 14:06:43.513 1 ERROR ovsdbapp.backend.ovs_idl.connection File
> "/usr/local/lib/python3.10/dist-packages/ovsdbapp/backend/ovs_idl/connection.py",
> line 107, in run
> 2023-04-03 14:06:43.513 1 ERROR ovsdbapp.backend.ovs_idl.connection
> self.idl.run()
> 2023-04-03 14:06:43.513 1 ERROR ovsdbapp.backend.ovs_idl.connection File
> "/usr/local/lib/python3.10/dist-packages/ovs/db/idl.py", line 433, in run
> 2023-04-03 14:06:43.513 1 ERROR ovsdbapp.backend.ovs_idl.connection
> self._session.run()
> 2023-04-03 14:06:43.513 1 ERROR ovsdbapp.backend.ovs_idl.connection File
> "/usr/local/lib/python3.10/dist-packages/ovs/jsonrpc.py", line 519, in run
> 2023-04-03 14:06:43.513 1 ERROR ovsdbapp.backend.ovs_idl.connection error
> = self.stream.connect()
> 2023-04-03 14:06:43.513 1 ERROR ovsdbapp.backend.ovs_idl.connection File
> "/usr/local/lib/python3.10/dist-packages/ovs/stream.py", line 824, in connect
> 2023-04-03 14:06:43.513 1 ERROR ovsdbapp.backend.ovs_idl.connection
> self.socket.do_handshake()
> 2023-04-03 14:06:43.513 1 ERROR ovsdbapp.backend.ovs_idl.connection File
> "/usr/lib/python3.10/ssl.py", line 1342, in do_handshake
> 2023-04-03 14:06:43.513 1 ERROR ovsdbapp.backend.ovs_idl.connection
> self._sslobj.do_handshake()
> 2023-04-03 14:06:43.513 1 ERROR ovsdbapp.backend.ovs_idl.connection
> ssl.SSLZeroReturnError: TLS/SSL connection has been closed (EOF) (_ssl.c:997)
> 2023-04-03 14:06:43.513 1 ERROR ovsdbapp.backend.ovs_idl.connection
> 2023-04-03 14:06:43.567 1 ERROR ovsdbapp.backend.ovs_idl.connection [-]
> TLS/SSL connection has been closed (EOF) (_ssl.c:997):
> ssl.SSLZeroReturnError: TLS/SSL connection has been closed (EOF) (_ssl.c:997)
>
> Traceback (most recent call last):
>   File
> "/usr/local/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/connection.py",
> line 107, in run
>     self.idl.run()
>   File
> "/usr/local/lib/python3.9/site-packages/ovs-3.1.0-py3.9.egg/ovs/db/idl.py",
> line 433, in run
>     self._session.run()
>   File
> "/usr/local/lib/python3.9/site-packages/ovs-3.1.0-py3.9.egg/ovs/jsonrpc.py",
> line 519, in run
>     error = self.stream.connect()
>   File
> "/usr/local/lib/python3.9/site-packages/ovs-3.1.0-py3.9.egg/ovs/stream.py",
> line 824, in connect
>     self.socket.do_handshake()
>   File "/usr/local/lib/python3.9/site-packages/eventlet/green/ssl.py", line
> 312, in do_handshake
>     return self._call_trampolining(
>   File "/usr/local/lib/python3.9/site-packages/eventlet/green/ssl.py", line
> 158, in _call_trampolining
>     return func(*a, **kw)
>   File "/usr/local/lib/python3.9/ssl.py", line 1305, in do_handshake
>     self._check_connected()
>   File "/usr/local/lib/python3.9/ssl.py", line 1089, in _check_connected
>     self.getpeername()
>
> OSError: [Errno 107] Transport endpoint is not connected
>
> Signed-off-by: Stefan Hoffmann <stefan.hoffm...@cloudandheat.com>
> Signed-off-by: Luca Czesla <luca.czesla@mail.schwarz>
> Signed-off-by: Max Lamprecht <max.lamprecht@mail.schwarz>
> Co-authored-by: Luca Czesla <luca.czesla@mail.schwarz>
> Co-authored-by: Max Lamprecht <max.lamprecht@mail.schwarz>

Hi Stefan,

thanks for your patch.

I do see CI failures, but I think these are false negatives:

* https://patchwork.ozlabs.org/project/openvswitch/patch/3f70ca7bafad296e18ed9579f30fd7044c47fc61.ca...@cloudandheat.com/

I'm retrying the GitHub based jobs here.

* https://github.com/horms/ovs/actions/runs/4753793285
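
For readers following along, the idea in the quoted description (catch the
exceptions raised by do_handshake so that the client treats them as a
connection failure and reconnects) could be sketched roughly as below. This
is only an illustration, not the code from the patch: the helper name
handshake_errno is hypothetical, and mapping TLS-level failures to
ECONNRESET is an assumption made for this example. It relies on the caller
treating any non-zero, non-EAGAIN return value as a reason to drop the
stream and reconnect.

```python
import errno
import ssl


def handshake_errno(sock):
    """Run the TLS handshake and return 0 or a positive errno value.

    Hypothetical helper for illustration: 'sock' is anything with a
    do_handshake() method, such as a non-blocking ssl.SSLSocket.
    """
    try:
        sock.do_handshake()
    except (ssl.SSLWantReadError, ssl.SSLWantWriteError):
        # Handshake still in progress on a non-blocking socket; retry later.
        return errno.EAGAIN
    except ssl.SSLError:
        # Covers ssl.SSLEOFError and ssl.SSLZeroReturnError from the
        # tracebacks above. SSLError.errno holds an SSL error code, not a
        # POSIX errno, so a fixed value is returned (assumption).
        return errno.ECONNRESET
    except OSError as e:
        # For example "[Errno 107] Transport endpoint is not connected".
        return e.errno or errno.ECONNRESET
    return 0
```

The except clauses are ordered from most to least specific, because
SSLWantReadError and SSLWantWriteError derive from ssl.SSLError, which in
turn derives from OSError.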