David Cournapeau wrote:
> Sturla Molden wrote:
>> On 12/11/2008 6:10 PM, Michael Gilbert wrote:
>>> Shouldn't numpy (and/or multiprocessing) be smart enough to prevent
>>> this kind of error? A simple enough solution would be to also include
>>> the process id as part of the seed
>>
>> It would not help, as the seeding is done prior to forking.
>>
>> I am mostly familiar with Windows programming. But what is needed is a
>> fork handler (similar to a system hook in Windows jargon) that sets a
>> new seed in the child process.
>>
>> Could pthread_atfork be used?
>
> The seed could be explicitly set in each task, no ?
>
>     def task(x):
>         np.random.seed()
>         return np.random.random(x)
>
> But does this really make sense ?
>
> Is the goal to parallelize a big sampler into N tasks of M trials, to
> produce the same result as a sequential set of M*N trials ? Then it does
> not sound like a trivial task at all. I know there exist libraries
> explicitly designed for parallel random number generation - maybe this
> is where we should look, instead of using heuristics which are likely to
> be bogus, and generate wrong results.
>
> cheers,
>
> David
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

This is not sufficient because you cannot ensure that the seed will be different every time task() is called.
A major part of the problem here is treating a parallel computing problem as a serial one. The streams must be independent across threads, and in particular they must not be cross-correlated with one another (another gotcha). It is up to the user to implement a thread-safe solution, such as sharing a single stream among all threads or forcing the different threads to start at different states. The only thing that Numpy could do is provide a parallel pseudo-random number generator.

Bruce
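
A rough sketch of the "start the threads at different states" option, for illustration only. It assumes a NumPy recent enough to provide np.random.SeedSequence and np.random.default_rng (added in NumPy 1.17, so not available when this thread was written). The names make_task_seeds, task and run are made up for the example and are not part of NumPy.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def make_task_seeds(root_seed, n_tasks):
    """Derive one child SeedSequence per task from a single root seed."""
    # spawn() hands out child seeds intended to give non-overlapping streams.
    return np.random.SeedSequence(root_seed).spawn(n_tasks)


def task(args):
    """Draw n uniform samples from a generator private to this task."""
    child_seed, n = args
    # Each task builds its own Generator; the global np.random state is unused.
    rng = np.random.default_rng(child_seed)
    return rng.random(n)


def run(root_seed=12345, n_tasks=4, n=3):
    """Run n_tasks tasks in a thread pool, each with its own stream."""
    seeds = make_task_seeds(root_seed, n_tasks)
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        return list(pool.map(task, [(seed, n) for seed in seeds]))


if __name__ == "__main__":
    for i, draws in enumerate(run()):
        print(i, draws)
```

Because every task receives its seed explicitly, the result does not depend on which worker picks up which task, and rerunning with the same root seed reproduces the same draws. Note that this is not the same as reproducing a single sequential stream of M*N trials, which is the harder problem David raises above.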