Dan Kenigsberg has submitted this change and it was merged. Change subject: protocoldetector: Fix busy poll loop [1/3] ......................................................................
protocoldetector: Fix busy poll loop [1/3] Watching top during protocoldetector tests show very high (100%) cpu usage during the slow client test. Profiling test_detect_slow_client test show that the reactor is running a busy poll loop: Sat Apr 18 15:43:05 2015 test-use-ssl.prof 191917 function calls (191928 primitive calls) in 0.367 seconds Ordered by: internal time List reduced from 197 to 10 due to restriction <10> ncalls tottime percall cumtime percall filename:lineno(function) 6598 0.091 0.000 0.325 0.000 asyncore.py:170(poll2) 32987 0.074 0.000 0.143 0.000 betterAsyncore.py:139(Dispatcher._delegate_call) 6598 0.036 0.000 0.160 0.000 betterAsyncore.py:209(Reactor._get_timeout) 6598 0.024 0.000 0.029 0.000 asyncore.py:416(AsyncoreEvent.__getattr__) 13194 0.022 0.000 0.064 0.000 betterAsyncore.py:69(Dispatcher.next_check_interval) 6598 0.015 0.000 0.343 0.000 asyncore.py:205(loop) 6597 0.012 0.000 0.068 0.000 asyncore.py:105(readwrite) 13194 0.012 0.000 0.057 0.000 betterAsyncore.py:51(Dispatcher.readable) 13194 0.012 0.000 0.069 0.000 betterAsyncore.py:54(Dispatcher.writable) 6595 0.011 0.000 0.029 0.000 protocoldetector.py:87(_ProtocolDetector.next_check_interval) Sat Apr 18 15:46:09 2015 test-no-ssl.prof 184319 function calls (184329 primitive calls) in 0.359 seconds Ordered by: internal time List reduced from 148 to 10 due to restriction <10> ncalls tottime percall cumtime percall filename:lineno(function) 6340 0.090 0.000 0.320 0.000 asyncore.py:170(poll2) 31696 0.073 0.000 0.139 0.000 betterAsyncore.py:139(Dispatcher._delegate_call) 6340 0.035 0.000 0.155 0.000 betterAsyncore.py:209(Reactor._get_timeout) 6340 0.022 0.000 0.026 0.000 asyncore.py:416(AsyncoreEvent.__getattr__) 12678 0.022 0.000 0.062 0.000 betterAsyncore.py:69(Dispatcher.next_check_interval) 6340 0.015 0.000 0.338 0.000 asyncore.py:205(loop) 12678 0.012 0.000 0.057 0.000 betterAsyncore.py:51(Dispatcher.readable) 6339 0.012 0.000 0.064 0.000 asyncore.py:105(readwrite) 12678 0.012 0.000 0.068 0.000 betterAsyncore.py:54(Dispatcher.writable) 6338 0.011 0.000 0.028 0.000 protocoldetector.py:87(_ProtocolDetector.next_check_interval) The root cause is not implementing writable() method, inheriting the writable() method from asyncore.dispatcher, returning True. The reactor poll the sockets for POLLOUT, and since a socket is always writable if you never write to it, the reactor is waking up immediately. This creates a busy loop, invoking poll 6340 times for handling one connection, sending one line. Now we implement writable(), so the reactor wakes up only when data can be read from a socket, or when a dispatcher has expired. Profiling same test after this show expected behavior: Sat Apr 18 15:53:40 2015 test-use-ssl-after.prof 686 function calls (697 primitive calls) in 0.005 seconds Ordered by: internal time List reduced from 199 to 10 due to restriction <10> ncalls tottime percall cumtime percall filename:lineno(function) 4 0.001 0.000 0.001 0.000 asyncore.py:416(AsyncoreEvent.__getattr__) 2 0.000 0.000 0.000 0.000 pthread.py:129(Cond.wait) 9 0.000 0.000 0.000 0.000 __init__.py:242(LogRecord.__init__) 4 0.000 0.000 0.000 0.000 socket.py:185(_socketobject.__init__) 4 0.000 0.000 0.004 0.001 asyncore.py:170(poll2) 24 0.000 0.000 0.000 0.000 socket.py:223(meth) 16 0.000 0.000 0.004 0.000 betterAsyncore.py:139(Dispatcher._delegate_call) 1 0.000 0.000 0.004 0.004 protocoldetectorTests.py:184(AcceptorTests.check_slow_client) 15 0.000 0.000 0.000 0.000 pthread.py:95(RLock.lock) 1 0.000 0.000 0.000 0.000 ssl.py:294(SSLSocket.close) Sat Apr 18 16:23:43 2015 test-no-ssl-after.prof 541 function calls (551 primitive calls) in 0.006 seconds Ordered by: internal time List reduced from 145 to 10 due to restriction <10> ncalls tottime percall cumtime percall filename:lineno(function) 2 0.002 0.001 0.002 0.001 threading.py:561(_Event.__init__) 8 0.000 0.000 0.001 0.000 __init__.py:242(LogRecord.__init__) 2 0.000 0.000 0.000 0.000 pthread.py:129(Cond.wait) 1 0.000 0.000 0.001 0.001 protocoldetectorTests.py:184(AcceptorTests.check_slow_client) 20/21 0.000 0.000 0.001 0.000 socket.py:223(meth) 8 0.000 0.000 0.001 0.000 __init__.py:736(MyMemoryHandler.handle) 3 0.000 0.000 0.000 0.000 socket.py:185(_socketobject.__init__) 13 0.000 0.000 0.000 0.000 pthread.py:95(RLock.lock) 8 0.000 0.000 0.001 0.000 __init__.py:1302(Logger.callHandlers) 10 0.000 0.000 0.004 0.000 betterAsyncore.py:139(Dispatcher._delegate_call) Change-Id: Id18d87d10db63f216699bd60ee82fd3ff08e7624 Signed-off-by: Nir Soffer <nsof...@redhat.com> Reviewed-on: https://gerrit.ovirt.org/39996 Reviewed-by: Dima Kuznetsov <dkuzn...@redhat.com> Reviewed-by: Piotr Kliczewski <piotr.kliczew...@gmail.com> Reviewed-by: Dan Kenigsberg <dan...@redhat.com> --- M lib/vdsm/sslutils.py M vdsm/protocoldetector.py 2 files changed, 9 insertions(+), 0 deletions(-) Approvals: Piotr Kliczewski: Looks good to me, but someone else must approve Nir Soffer: Verified Dima Kuznetsov: Looks good to me, but someone else must approve Dan Kenigsberg: Looks good to me, approved -- To view, visit https://gerrit.ovirt.org/39996 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: merged Gerrit-Change-Id: Id18d87d10db63f216699bd60ee82fd3ff08e7624 Gerrit-PatchSet: 3 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Nir Soffer <nsof...@redhat.com> Gerrit-Reviewer: Allon Mureinik <amure...@redhat.com> Gerrit-Reviewer: Dan Kenigsberg <dan...@redhat.com> Gerrit-Reviewer: Dima Kuznetsov <dkuzn...@redhat.com> Gerrit-Reviewer: Nir Soffer <nsof...@redhat.com> Gerrit-Reviewer: Piotr Kliczewski <piotr.kliczew...@gmail.com> Gerrit-Reviewer: Yaniv Bronhaim <ybron...@redhat.com> Gerrit-Reviewer: automat...@ovirt.org Gerrit-Reviewer: oVirt Jenkins CI Server _______________________________________________ vdsm-patches mailing list vdsm-patches@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/vdsm-patches