On Friday 22 May 2009 13:59:17 Andrew Friedley wrote:
> Using multiple cores is pretty easy for element-wise ufuncs; no
> communication needs to occur and the work partitioning is trivial. And
> actually I've found with some initial testing that multiple cores do
> still help when you are memory bound. I don't fully understand why yet,
> though I have some ideas. One reason is multiple memory controllers due
> to multiple sockets (e.g. Opteron).

Yeah, I think this is most likely the reason. If, as in your case, you have
several independent paths from different processors to your data, then you
can achieve speed-ups even if you are memory bound in the one-processor
scenario.

> Another is that each thread is
> pulling memory from a different bank, utilizing more bandwidth than a
> single sequential thread could. However if that's the case, we could
> possibly come up with code for a single thread that achieves (nearly)
> the same additional throughput.

Well, I don't think you can achieve significant speed-ups in this case, but
experimenting never hurts :)

Good luck!

-- 
Francesc Alted
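
For anyone who wants to experiment with the trivial work partitioning discussed above, here is a minimal sketch, not the code Andrew was testing. It assumes a 1-D input, a unary ufunc whose output dtype matches the input, and that NumPy releases the GIL inside the ufunc inner loop (which it generally does for non-object dtypes). The helper name `chunked_ufunc` and the `n_threads` parameter are hypothetical, chosen only for illustration.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def chunked_ufunc(ufunc, x, n_threads=4):
    """Apply a unary ufunc to a 1-D array, one contiguous chunk per thread.

    Hypothetical helper: assumes the ufunc output dtype equals x.dtype.
    """
    out = np.empty_like(x)
    # n_threads + 1 evenly spaced boundaries delimit n_threads chunks.
    bounds = np.linspace(0, x.size, n_threads + 1, dtype=int)

    def work(i):
        lo, hi = bounds[i], bounds[i + 1]
        # Each thread writes only to its own slice of `out`, so the
        # threads need no communication or locking.
        ufunc(x[lo:hi], out=out[lo:hi])

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(work, range(n_threads)))
    return out


x = np.linspace(0.0, 1.0, 1_000_000)
y = chunked_ufunc(np.sqrt, x)
```

Timing this against a plain `np.sqrt(x)` for different `n_threads` values is one way to see whether extra cores help on a given machine when the operation is memory bound; the outcome will depend on the memory layout of the hardware, as discussed above.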