> A couple of thoughts on parallelism:
>
> 1. Can someone come up with a small set of cases and time them on
> numpy, IDL, Matlab, and C, using various parallel schemes, for each of
> a representative set of architectures? We're comparing a benchmark to
> itself on different architectures, rather than seeing whether the
> thread capability is helping our competition on the same architecture.
> If it's mostly not helping them, we can forget it for the time being.
> I suspect that it is, in fact, helping them, or at least not hurting
> them.

Well, I could ask some IDL users to provide you with benchmarks. For
C/OpenMP, I have posted a trivial example.
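As a starting point for the cross-language comparison in point 1, here is a minimal sketch of one benchmark case in numpy (all names here are assumptions, not an agreed benchmark suite): time an elementwise multiply-add at several array sizes, reporting the best of several repeats. The same case would then be ported to IDL, Matlab, and C/OpenMP so each implementation is compared on the same machine.

```python
# Hedged sketch of one benchmark case: y = a*x + b, timed at several sizes.
# Names (bench_case) are hypothetical; the point is a repeatable per-case
# timing that can be replicated in IDL, Matlab, and C/OpenMP.
import timeit
import numpy as np

def bench_case(n, repeats=5):
    """Time y = a*x + b for arrays of length n; return best time per call."""
    a, b = 1.5, 2.5
    x = np.arange(n, dtype=np.float64)
    # best-of-`repeats`, 10 calls each, to damp out scheduler noise
    times = timeit.repeat(lambda: a * x + b, number=10, repeat=repeats)
    return min(times) / 10

if __name__ == "__main__":
    for n in (10**3, 10**5, 10**6):
        print("n=%d  %.3e s per call" % (n, bench_case(n)))
```

Taking the minimum over repeats, rather than the mean, is the usual choice for microbenchmarks since background load can only slow a run down.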
> 2. Would it slow things much to have some state that the routines
> check before deciding whether to run a parallel implementation or not?
> It could default to single thread except in the cases where
> parallelism always helps, but the user can configure it to multithread
> beyond certain thresholds of, say, number of elements. Then, in the
> short term, a savvy user can tweak that state to get parallelism for
> more than N elements. In the longer term, there could be a test
> routine that would run on install and configure the state for that
> particular machine. When numpy started it would read the saved file
> and computation would be optimized for that machine. The user could
> always override it.

No, it wouldn't cost much, and that is exactly how IDL (for instance)
works.

> 3. We should remember the first rule of parallel programming, which
> Anne quotes as "premature optimization is the root of all evil".
> There is a lot to fix in numpy that is more fundamental than speed. I
> am the first to want things fast (I would love my secondary eclipse
> analysis to run in less than a week), but we have gaping holes in
> documentation and other areas that one would expect to have been
> filled before a 1.0 release. I hope we can get them filled for 1.1.
> It bears repeating that our main resource shortage is in person-hours,
> and we'll get more of those as the community grows. Right now our
> deficit in documentation is hurting us badly, while our deficit in
> parallelism is not. There is no faster way of growing the community
> than making it trivial to learn how to use numpy without hand-holding
> from an experienced user. Let's explore parallelism to assess when
> and how it might be right to do it, but let's stay focussed on the
> fundamentals until we have those nailed.

That is well put and clear. It is also clear that our deficit in
parallelism is not hurting us that badly.
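The threshold state described in point 2 could be sketched roughly as follows. All names here (`set_threshold`, `use_parallel`, the config format) are hypothetical illustrations, not a proposed numpy API: each routine consults a per-operation element-count threshold before taking the parallel path, and the table can be saved and reloaded so an install-time test routine could tune it per machine, with the user always able to override.

```python
# Hedged sketch (hypothetical names) of configurable parallel thresholds:
# default to single-threaded (infinite threshold) unless configured.
import json
import math

_thresholds = {"add": math.inf, "multiply": 500_000}  # example defaults

def set_threshold(op, n):
    """User override: run `op` in parallel for inputs of n elements or more."""
    _thresholds[op] = n

def use_parallel(op, nelems):
    """Cheap check a routine would make before choosing an implementation."""
    return nelems >= _thresholds.get(op, math.inf)

def save_config(path):
    """Persist the tuned thresholds (e.g. written by an install-time test)."""
    with open(path, "w") as f:
        json.dump({k: (None if v == math.inf else v)
                   for k, v in _thresholds.items()}, f)

def load_config(path):
    """Read the saved per-machine state back in at startup."""
    with open(path) as f:
        for k, v in json.load(f).items():
            _thresholds[k] = math.inf if v is None else v
```

The per-call cost is one dictionary lookup and a comparison, which supports the reply above: checking the state is far cheaper than the operations it gates.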
It is a real problem in some communities, such as astronomers and
image-processing people, but the lack of documentation is the first
one, that is true.

Xavier

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion