On 27Nov2014 17:55, Dave Angel <[email protected]> wrote:
> On 11/27/2014 04:01 PM, Albert-Jan Roskam wrote:
>> I made a comparison between multiprocessing and threading. In the code below
>> (it's also here: http://pastebin.com/BmbgHtVL), multiprocessing is more than 100
>> (yes: one hundred) times slower than threading! That is
>> I-must-be-doing-something-wrong-ishly slow. Any idea whether I am doing
>> something wrong? I can't believe the difference is so big.
>
> The bulk of the time is spent marshalling the data to the dictionary
> self.lookup. You can speed it up some by using a list there (it also
> makes the code much simpler). But the real trick is to communicate
> less often between the processes.
[...]
Exactly so. You're being bitten by latency and of course the sheer cost of
copying stuff around. With a thread the latency and copying are effectively
zero: the data structure you're using in one thread is the same data structure
in use by another. With multiprocessing they're completely separate (distinct
memory spaces); data must be passed from one to the other, and there's a cost
for that.
By treating multiprocessing like threading in terms of the shared data, you're
making lots of little updates, and every one of them pays that cost.
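
A rough sketch of the difference, not your actual code: multiprocessing
pickles objects to move them between processes, so this just pickles the same
updates two ways and counts the messages and bytes. The function names and the
(n, n * n) data are made up for illustration, and nothing here starts a real
process.

```python
import pickle


def per_item_messages(pairs):
    """One message per (key, value) update, as if each were sent separately."""
    return [pickle.dumps(pair) for pair in pairs]


def batched_message(pairs):
    """A single message carrying every update at once."""
    return [pickle.dumps(list(pairs))]


def apply_messages(messages):
    """Rebuild the lookup dict on the receiving side."""
    lookup = {}
    for raw in messages:
        payload = pickle.loads(raw)
        # A batch arrives as a list of pairs; a single update as one tuple.
        updates = payload if isinstance(payload, list) else [payload]
        lookup.update(updates)
    return lookup


# Hypothetical workload: 1000 small updates to a lookup table.
pairs = [(n, n * n) for n in range(1000)]
small = per_item_messages(pairs)
big = batched_message(pairs)

print(len(small), "messages,", sum(map(len, small)), "bytes")
print(len(big), "message,", sum(map(len, big)), "bytes")
```

Both routes rebuild the same dictionary, but the batched one crosses the
process boundary once instead of a thousand times. In a real program each of
those crossings also pays for a pipe write and a wakeup on the other side,
which is where most of the latency goes.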
See sig quote.
Cheers,
Cameron Simpson <[email protected]>
The Eight Fallacies of Distributed Computing - Peter Deutsch
1. The network is reliable
2. Latency is zero
3. Bandwidth is infinite
4. The network is secure
5. Topology doesn't change
6. There is one administrator
7. Transport cost is zero
8. The network is homogeneous