On 10-Sep-09, at 12:47 AM, Sturla Molden wrote:

> The CPU is equally great (or better?) for doing dot(). In both cases:
>
> - memory access scales O(n) for dot products.
> - computation scales O(n) for dot products.
> - memory use is low
> - computation is fast (faster for GPU)
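To make the quoted point concrete: a dot product does roughly as much memory traffic as arithmetic, so its arithmetic intensity is far below what either processor can sustain, and bandwidth becomes the limit. A rough back-of-the-envelope sketch (the numbers here are illustrative, not from the thread):

```python
import numpy as np

def dot_intensity(n):
    """Approximate flops-per-byte for dot(x, y) on n-element float64 vectors."""
    flops = 2 * n - 1          # n multiplies + (n - 1) adds
    bytes_moved = 2 * 8 * n    # read both operands once, 8 bytes per element
    return flops / bytes_moved

# ~0.125 flops/byte -- orders of magnitude below peak compute on any
# modern CPU or GPU, so dot() is bandwidth-bound on both.
intensity = dot_intensity(1_000_000)

x = np.random.rand(1_000_000)
y = np.random.rand(1_000_000)
r = np.dot(x, y)
```

This is why the GPU's advantage here, if any, comes from its higher memory bandwidth rather than its parallel ALUs.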
You do realize that the throughput from onboard (video) RAM is going to be much higher, right? It's not just the parallelization but the memory bandwidth. And as James pointed out, if you can keep most of your intermediate computation on-card, you stand to benefit immensely, even when doing some operations where the GPU provides no tangible benefit (i.e. the benefit is in aggregate and in avoiding copies).

FWIW, I agree with you that NumPy isn't the place for GPU stuff to happen. In the short to medium term we need a way to make it simpler for naturally expressed computations not to go hog wild with temporary allocations (a very hard problem given the constraints of the language). In the long term I envision something with machinery flexible enough to manipulate objects in GPU memory with the same ease as in main memory, but I think the path to that lies in increasing the generality and flexibility of the interfaces exposed.

David

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
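P.S. A minimal sketch of the temporary-allocation problem described above, for anyone following along (the array names are illustrative, not from the thread):

```python
import numpy as np

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)
c = np.random.rand(1_000_000)

# Naturally expressed: each operator allocates a fresh intermediate
# array, so this line creates a temporary for a * b before the add.
d = a * b + c

# Hand-optimized: reuse one buffer via the ufunc out= argument, so only
# a single array is allocated -- at a real cost in readability.
d2 = np.multiply(a, b)
np.add(d2, c, out=d2)

assert np.allclose(d, d2)
```

Making the first form run like the second, without the user rewriting it, is the hard part given how Python evaluates expressions.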