New submission from Irvin Probst:

After checking with the kind people of h...@python.org I think I found a bug in 
the way connections are handled in the multiprocessing module.

Using Python 3.3 I've noticed a performance drop of about 70x when running 
some code performing basic requests on a SyncManager. As that code was buried 
deep inside a big project, I wrote the test case attached to this report to 
reproduce the behavior.

Here is what the test case does:
- define a class SharedData with several instance variables (a, b and c here)
- expose two of its methods, good() and bad(), through a proxy; both show the 
huge performance drop under 3.3 and can be used to reproduce the behavior. The 
only difference is that good() takes a mutex whereas bad() does not, which I 
added to check that mutexes were not to blame for this problem.
- create a SyncManager giving access to a SharedData instance
- launch a multiprocessing.Process() running the do_stuff() function, which 
calls the good() (or bad()) method of SharedData 10 times through the 
SyncManager, passing some values in and getting the result back
- after each call to the proxy, print the time elapsed, roughly measured with 
time.time()
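For reference, the structure of the test case looks roughly like this (a 
simplified sketch; names such as SharedData, good(), bad() and do_stuff() 
follow the description above, but the attached test_manager.py is the 
authoritative reproducer):

```python
import time
import multiprocessing
from multiprocessing.managers import SyncManager

class SharedData:
    def __init__(self):
        self.a = 0
        self.b = 0
        self.c = 0
        self.lock = multiprocessing.Lock()

    def good(self, value):
        # Same as bad(), but holds a mutex, to rule mutexes out.
        with self.lock:
            self.a = value
            return self.a + self.b + self.c

    def bad(self, value):
        self.a = value
        return self.a + self.b + self.c

class MyManager(SyncManager):
    pass

MyManager.register("SharedData", SharedData)

def do_stuff(proxy):
    # Call the proxied method 10 times, timing each round-trip.
    for i in range(10):
        start = time.time()
        proxy.good(i)
        print("call %d: %.4f sec" % (i, time.time() - start))

if __name__ == "__main__":
    manager = MyManager()
    manager.start()
    proxy = manager.SharedData()
    worker = multiprocessing.Process(target=do_stuff, args=(proxy,))
    worker.start()
    worker.join()
    manager.shutdown()
```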

System specs:
Linux turing 3.12-1-686-pae #1 SMP Debian 3.12.6-2 (2013-12-29) i686 GNU/Linux

Python version:
latest 2.6, 2.7, 3.2 and 3.3 from the standard Debian repos
3.3.0 and 3.4.0 beta3 compiled from source

Time elapsed per proxy call using Python 2.6, 2.7 and 3.2:
first call ~0.04 sec, subsequent calls ~0.001 sec

Time elapsed per proxy call using Python 3.3.0, 3.3.2, 3.3.3 and 3.4.0b3:
first call ~0.27 sec, subsequent calls ~0.07 sec

I reproduced this behavior with Python 2.7 and 3.3.3 on an Ubuntu machine 
running the latest amd64 release.

I also tried without a print() after each call, but it does not change 
anything: Python 3.3 is *much* slower here.

Using cProfile I tracked this down to the built-in read() method, and indeed 
timing the read() syscall in posix_read() using gettimeofday() confirms it 
takes ages; posix_write() is fine, though. I suspect the problem lies in the 
underlying socket the interpreter is reading from.
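To rule out the raw socket layer itself, one can time a recv() on a plain 
socket pair outside multiprocessing entirely; a minimal sketch (not part of 
the attached test case) that establishes what a "normal" read costs:

```python
import socket
import time

# Baseline: time a single send/recv round-trip on a local socket
# pair, bypassing multiprocessing's connection code entirely.
a, b = socket.socketpair()
payload = b"x" * 256

start = time.time()
a.sendall(payload)
data = b.recv(1024)
elapsed = time.time() - start

print("recv of %d bytes took %.6f sec" % (len(data), elapsed))
a.close()
b.close()
```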

connection.py in multiprocessing was rewritten between 3.2 and 3.3, but I 
can't see what's wrong in the new version; the basic socket options appear to 
be exactly the same.

One interesting point is that the slowdown seems to affect only the last bytes 
sent through the socket: e.g. if you send a numpy.array big enough to fill the 
socket's read() buffer, the first calls to read() run at normal speed and only 
the last one takes time.
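That observation can be checked outside the manager with a loop that receives 
a large payload in chunks and times each recv() separately (again a 
hypothetical sketch over a plain socket pair, not the attached test case):

```python
import socket
import threading
import time

CHUNK = 8192
TOTAL = 10 * CHUNK  # large enough to need several read() calls

a, b = socket.socketpair()
# Send from a background thread so sendall() can block without
# deadlocking the receive loop below.
sender = threading.Thread(target=a.sendall, args=(b"x" * TOTAL,))
sender.start()

received = 0
timings = []
while received < TOTAL:
    t0 = time.time()
    data = b.recv(CHUNK)
    timings.append((len(data), time.time() - t0))
    received += len(data)
sender.join()

# If the hypothesis holds, only the final recv() shows the delay.
for i, (n, t) in enumerate(timings):
    print("recv %d: %d bytes in %.6f sec" % (i, n, t))
```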

If you confirm that my test code is not to blame, then IMHO this makes the 
SyncManager in 3.3 and 3.4 completely unusable for frequent data exchanges 
between processes.

Thanks for your time.

----------
components: Library (Lib)
files: test_manager.py
messages: 210457
nosy: Irvin.Probst
priority: normal
severity: normal
status: open
title: Python 3.3/3.4 regression in multiprocessing manager ?
type: performance
versions: Python 3.3, Python 3.4
Added file: http://bugs.python.org/file33956/test_manager.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20540>
_______________________________________