Zhou Zheng Sheng has posted comments on this change.

Change subject: testRegeneration of remoteFileHandler fails when running all 
tests
......................................................................


Patch Set 1: (1 inline comment)

....................................................
File tests/remoteFileHandlerTests.py
Line 51:         the requests"""
Line 52:         for i in range(HANDLERS_NUM * 2):
Line 53:             self.testTimeout()
Line 54:         for i in range(HANDLERS_NUM):
Line 55:             self.testEcho()
Thanks Yaniv, but from the code I think the pool will never get full, because we 
are calling callCrabRPCFunction synchronously, as I pointed out in my last 
comment. To prove that, we can add a print statement in 
RemoteFileHandlerPool.callCrabRPCFunction() at line 275, as follows.

                handler = self.handlers[i]
                if not self._isHandlerAvailable(handler):
                    handler = self.handlers[i] = PoolHandler()

                print "TTTTTTTTTT: using handler", i
                return handler.proxy.callCrabRPCFunction(timeout, name,
                                                         *args, **kwargs)

Then

  ./run_tests_local.sh -s remoteFileHandlerTests.py

With the -s option, nose does not capture stdout, so we can see which handler is 
being used. I get the following results on my machine.

nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
PoolHandlerTests
    testStop                                                    OK
RemoteFileHandlerTests
    testEcho                                                    TTTTTTTTTT: using handler 0
OK
    testRegeneration                                            TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
TTTTTTTTTT: using handler 0
OK
    testTimeout                                                 TTTTTTTTTT: using handler 0
OK

----------------------------------------------------------------------
Ran 4 tests in 1.842s

It always uses handler[0]. The sequence of operations is as follows:

go through all the handler slots (from handler[0] to handler[N])
  try to lock handler[i]; success means this handler slot is free
  find that the value in the slot is None
    instantiate a new PoolHandler and put it in the slot
  call a function that times out
  catch the timeout exception
  kill the PoolHandler instance in that slot
  put None in the slot
  unlock the handler slot and re-raise the timeout exception

So every time we leave callCrabRPCFunction, the slot lock is guaranteed to be 
released. If we call callCrabRPCFunction again and again synchronously, we will 
always lock handler[0] successfully and keep reusing it. Have a look at the 
"except Timeout" code block: it guarantees that None is put in handler[i], so 
the next time this slot is used we get a new instance of PoolHandler. The 
remote timeout and echo calls therefore always run synchronously on handler[0], 
and after every timeout call a new PoolHandler instance is put in handler[0] 
to serve the echo call.
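
To make the argument concrete, here is a minimal stand-alone sketch of the slot 
logic as I understand it. It is not the vdsm code: SketchPool, Timeout, 
timeout_call and echo_call are hypothetical stand-ins, the pool size of 10 is 
arbitrary, and a plain object() replaces PoolHandler. It is written for 
Python 3 so it can be run outside the vdsm tree.

```python
import threading


class Timeout(Exception):
    """Stand-in for the timeout exception raised by a remote call."""


class SketchPool(object):
    """Hypothetical, simplified model of the slot logic described above."""

    def __init__(self, size):
        self.locks = [threading.Lock() for _ in range(size)]
        self.handlers = [None] * size
        self.used = []      # slot index chosen by each call
        self.created = 0    # number of handler instances created

    def call(self, func):
        for i, lock in enumerate(self.locks):
            # A non-blocking acquire succeeds only if the slot is free.
            if not lock.acquire(False):
                continue
            try:
                if self.handlers[i] is None:
                    self.handlers[i] = object()  # stands in for PoolHandler()
                    self.created += 1
                self.used.append(i)
                return func()
            except Timeout:
                # Drop the handler so the next call gets a fresh one.
                self.handlers[i] = None
                raise
            finally:
                # The slot lock is always released before we return.
                lock.release()
        raise RuntimeError("no free handler slot")


def timeout_call():
    raise Timeout()


def echo_call():
    return "echo"


pool = SketchPool(size=10)

# Same shape as the test: many timeouts first, then echo calls, all sequential.
for _ in range(20):
    try:
        pool.call(timeout_call)
    except Timeout:
        pass
echo_results = [pool.call(echo_call) for _ in range(10)]

print(sorted(set(pool.used)), pool.created)  # prints: [0] 21
```

Under these assumptions every call lands on slot 0, each timeout costs one new 
handler, and the echo calls all succeed on the handler created after the last 
timeout.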

To test the situation you mentioned above, the remote calls must be initiated 
from many other threads, and the number of threads must be greater than the 
number of slots.
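
A rough sketch of what such a test would have to do, again outside vdsm and 
under my own assumptions: SLOTS, THREADS and worker are hypothetical names, and 
a thread that finds no free slot simply records None (what the real pool does 
in that case is not modelled here). A threading.Barrier (Python 3) keeps every 
thread inside its "call" until all of them have tried to take a slot, so the 
calls really overlap.

```python
import threading

SLOTS = 4
THREADS = 6  # deliberately more threads than slots

locks = [threading.Lock() for _ in range(SLOTS)]
barrier = threading.Barrier(THREADS)
results = []                      # slot index taken by each thread, or None
results_lock = threading.Lock()


def worker():
    slot = None
    for i, lock in enumerate(locks):
        # Non-blocking acquire: take the first free slot, skip busy ones.
        if lock.acquire(False):
            slot = i
            break
    # Hold the slot until every thread has made its attempt.
    barrier.wait()
    with results_lock:
        results.append(slot)
    if slot is not None:
        locks[slot].release()


threads = [threading.Thread(target=worker) for _ in range(THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

taken = sorted(s for s in results if s is not None)
print(taken, results.count(None))  # prints: [0, 1, 2, 3] 2
```

Only with overlapping calls like this do slots other than 0 get used and some 
callers find the pool full; the sequential loops in the test never reach that 
state.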

So I still cannot see why echo would time out. I did see a failure report 
from Jenkins at 
http://jenkins.ovirt.org/job/vdsm_unit_tests_manual_gerrit/160/console , but I 
cannot reproduce it. Can you suggest a way to reproduce this error?
Line 56: 
Line 57:     def tearDown(self):
Line 58:         self.pool.close()
Line 59: 


--
To view, visit http://gerrit.ovirt.org/9412
To unsubscribe, visit http://gerrit.ovirt.org/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I35ae1258d01455ad2fe131cd17bc3dff89224c1b
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Yaniv Bronhaim <[email protected]>
Gerrit-Reviewer: Dan Kenigsberg <[email protected]>
Gerrit-Reviewer: Royce Lv <[email protected]>
Gerrit-Reviewer: Saggi Mizrahi <[email protected]>
Gerrit-Reviewer: Yaniv Bronhaim <[email protected]>
Gerrit-Reviewer: Zhou Zheng Sheng <[email protected]>
Gerrit-Reviewer: oVirt Jenkins CI Server
_______________________________________________
vdsm-patches mailing list
[email protected]
https://lists.fedorahosted.org/mailman/listinfo/vdsm-patches
