On 27Nov2014 17:55, Dave Angel <[email protected]> wrote:
> On 11/27/2014 04:01 PM, Albert-Jan Roskam wrote:
>> I made a comparison between multiprocessing and threading.  In the code
>> below (it's also here: http://pastebin.com/BmbgHtVL), multiprocessing is
>> more than 100 (yes: one hundred) times slower than threading! That is
>> I-must-be-doing-something-wrong-ishly slow. Any idea whether I am doing
>> something wrong? I can't believe the difference is so big.
>
> The bulk of the time is spent marshalling the data to the dictionary
> self.lookup. You can speed it up some by using a list there (it also
> makes the code much simpler). But the real trick is to communicate less
> often between the processes.
[...]

Exactly so. You're being bitten by latency and of course the sheer cost of
copying stuff around. With a thread the latency and copying are effectively
zero: the data structure you're using in one thread is the same data
structure in use by another. With multiprocessing they're completely
separate (distinct memory spaces); data must be passed from one to the
other, and there's a cost for that.
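
To make that concrete, here's a rough sketch (not your actual code from the
pastebin, just an illustration; the names worker/d/shared are made up).
With a thread the worker writes into the very same dict; the nearest
multiprocessing equivalent is a Manager dict, where every single assignment
is pickled, shipped to the manager process and acknowledged:

    from threading import Thread
    from multiprocessing import Process, Manager

    def worker(lookup):
        # with a thread this writes straight into the caller's dict;
        # with a Manager dict each assignment is pickled and sent to
        # the manager process, which sends a reply back
        for i in range(10000):
            lookup[i] = i * 2

    if __name__ == '__main__':
        # threading: same object, same memory; effectively free
        d = {}
        t = Thread(target=worker, args=(d,))
        t.start()
        t.join()

        # multiprocessing: separate memory space; every update is a
        # round trip to another process
        m = Manager()
        shared = m.dict()
        p = Process(target=worker, args=(shared,))
        p.start()
        p.join()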

By treating multiprocessing like threading in terms of the shared data,
you're making lots of little updates, and every one of them pays that
communication cost.
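
The usual cure is to batch: let the child build its result locally and send
it back in one (or a few) big chunks. A minimal sketch of that, again with
made-up names and assuming a dict result like your self.lookup:

    from multiprocessing import Process, Queue

    def worker(n, q):
        # build the whole result locally, then ship it back in one go:
        # one large pickle instead of tens of thousands of tiny ones
        q.put({i: i * 2 for i in range(n)})

    if __name__ == '__main__':
        q = Queue()
        p = Process(target=worker, args=(100000, q))
        p.start()
        lookup = q.get()    # a single transfer replaces per-item updates
        p.join()

Whether that ends up faster than threading depends on how much real
computation there is per item, but at least the communication stops
dominating the runtime.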

See sig quote.

Cheers,
Cameron Simpson <[email protected]>

The Eight Fallacies of Distributed Computing - Peter Deutsch
1. The network is reliable
2. Latency is zero
3. Bandwidth is infinite
4. The network is secure
5. Topology doesn't change
6. There is one administrator
7. Transport cost is zero
8. The network is homogeneous