On 03/11/2013 01:57 AM, Abhinav M Kulkarni wrote:
<SNIP> * My laptop has quad-core Intel i5 processor, so I thought using multiprocessing module I can parallelize my code (basically calculate gradient in parallel on multiple cores simultaneously).
<SNIP>
* As a result I end up creating a process for each data point (instead of a thread that I would ideally like to do, so as to avoid process creation overhead).
It seems you only need 4 processes, since you have 4 cores. Instead of creating a new process for each data point, create 4 processes once and reuse them, letting each one handle a quarter of the data.
It's not the process creation itself that's particularly slow, but all the initialization involved in starting another instance of Python. If you're on Linux, you might be able to speed that up by using fork, but I don't know for certain.
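Here is a rough sketch of what I mean, not a drop-in replacement for your code. Since I haven't seen your model, everything specific here is an assumption: the scalar least-squares gradient, the toy data, the step size, and the helper names (partial_gradient, split, parallel_gradient) are all made up for illustration. The point is only that the pool of 4 workers is created once and then reused on every iteration, with one task per worker instead of one process per data point. The "fork" start method is requested only where the platform offers it; otherwise the platform default is used.

```python
import multiprocessing as mp


def partial_gradient(args):
    """Sum the per-point gradients for one chunk of the data."""
    w, chunk = args
    # Hypothetical model: loss 0.5 * (w*x - y)**2 with a scalar weight w,
    # so the gradient with respect to w for one point is (w*x - y) * x.
    return sum((w * x - y) * x for x, y in chunk)


def split(data, n_chunks):
    """Split data into n_chunks contiguous pieces of nearly equal size."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_gradient(pool, w, data, n_workers=4):
    """Compute the full gradient using one task per worker."""
    tasks = [(w, chunk) for chunk in split(data, n_workers)]
    return sum(pool.map(partial_gradient, tasks))


if __name__ == "__main__":
    # Toy data following y = 2*x, so w should move toward 2.
    data = [(float(i), 2.0 * i) for i in range(1000)]

    # Ask for "fork" where it is available (Unix); otherwise fall back
    # to whatever the platform default is.
    method = "fork" if "fork" in mp.get_all_start_methods() else None
    ctx = mp.get_context(method)

    # The pool is created once, outside the loop, and reused each step.
    with ctx.Pool(processes=4) as pool:
        w = 0.0
        for step in range(20):
            w -= 1e-9 * parallel_gradient(pool, w, data)
    print(w)
```

Each call to pool.map still has to pickle the chunks and send them to the workers, so for a very cheap per-point gradient the communication cost may outweigh the gain; it is worth timing against the serial version.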
-- DaveA