[ 
https://issues.apache.org/jira/browse/THRIFT-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791157#comment-14791157
 ] 

ASF GitHub Bot commented on THRIFT-2918:
----------------------------------------

GitHub user bufferoverflow opened a pull request:

    https://github.com/apache/thrift/pull/606

    THRIFT-2918 Race condition in Python TProcessPoolServer test

    This makes python test pass on my machine.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bufferoverflow/thrift THRIFT-2918

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/thrift/pull/606.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #606
    
----
commit 66192cf5394c71d499db179515b89820858fe402
Author: Roger Meier <ro...@apache.org>
Date:   2015-09-16T17:53:07Z

    THRIFT-2918 Race condition in Python TProcessPoolServer test

----


> Race condition in Python TProcessPoolServer test
> ------------------------------------------------
>
>                 Key: THRIFT-2918
>                 URL: https://issues.apache.org/jira/browse/THRIFT-2918
>             Project: Thrift
>          Issue Type: Bug
>          Components: Python - Library
>         Environment: openSUSE 13.2
> Python 2.7.8 (default, Sep 30 2014, 15:34:38) [GCC] on linux2
>            Reporter: Jens Geyer
>            Assignee: Jens Geyer
>             Fix For: 0.9.3
>
>         Attachments: THRIFT-2918-python-test-timeouts.patch
>
>
> {{make check}} gets stuck very reproducible in my VM at Test run #218:
> {code}
> Test run #217:  (includes gen-py-default) Server=TProcessPoolServer,  
> Proto=accel,  zlib=False,  SSL=False
> Testing server TProcessPoolServer: /usr/bin/python ./TestServer.py 
> --genpydir=gen-py-default --protocol=accel --port=9090 TProcessPoolServer
> Testing client: /usr/bin/python ./TestClient.py --genpydir=gen-py-default 
> --protocol=accel --port=9090 --transport=buffered
> ...testException(Safe)
> testException(Xception)
> testException(throw_undeclared)
> ...............
> ----------------------------------------------------------------------
> Ran 18 tests in 0.563s
> OK
> Giving TProcessPoolServer (proto=accel,zlib=False,ssl=False) an extra 3 
> seconds for childprocesses to terminate via alarm
> Terminating worker: <Process(Process-1, started daemon)>
> Terminating worker: <Process(Process-2, started daemon)>
> Terminating worker: <Process(Process-3, started daemon)>
> Terminating worker: <Process(Process-4, started daemon)>
> Terminating worker: <Process(Process-5, started daemon)>
> Requesting server to stop()
> OK: Finished (includes gen-py-default)  TProcessPoolServer / accel proto / 
> zlib=False / SSL=False.   217 combinations tested.
> Test run #218:  (includes gen-py-default) Server=TProcessPoolServer,  
> Proto=accel,  zlib=False,  SSL=True
> Testing server TProcessPoolServer: /usr/bin/python ./TestServer.py 
> --genpydir=gen-py-default --protocol=accel --port=9090 --ssl 
> TProcessPoolServer
> Testing client: /usr/bin/python ./TestClient.py --genpydir=gen-py-default 
> --protocol=accel --port=9090 --ssl --transport=buffered
> ...testException(Safe)
> testException(Xception)
> testException(throw_undeclared)
> ..........Terminating worker: <Process(Process-1, started daemon)>
> Terminating worker: <Process(Process-2, started daemon)>
> Terminating worker: <Process(Process-3, started daemon)>
> Terminating worker: <Process(Process-4, started daemon)>
> Terminating worker: <Process(Process-5, started daemon)>
> Requesting server to stop()
> {code}
> After fiddling a bit around with it I got it to work by increasing the 
> alarm() timeout from 2 seconds to 4 seconds. 
> I'm not a Python expert, but the code looks somewhat interesting to me: 
> - The server code starts the workers, but some piece of code outside of the 
> server is responsible for terminating them. Is that really idiomatic in 
> Python or just bad design? 
> - The Condition() object used in TProcessPoolsServer.py should probably be 
> replaced by an Event() object. Especially, as Condition.wait() [seems to have 
> it's own 
> perils|http://stackoverflow.com/questions/24137480/threading-condition-waittimeout-ignores-threading-condition-notify]
>  and Event is much easier to use.
> - Calling Condition.aquire() without a matching release() within a {{while 
> True:}} loop looks also not very convincing to me. AFAIK the second call to 
> aquire() will block, if that ever happens (it did not in my tests).
> The bad news is, that neither of the proposed changes above had any effect on 
> the race conditions, except increasing the timeout - but that is merely a 
> workaround, not a solution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to