On 10-Sep-09, at 12:47 AM, Sturla Molden wrote:

> The CPU is equally great (or better?) for doing dot(). In both cases:
>
> - memory access scales O(n) for dot products.
> - computation scales O(n) for dot products.
> - memory is low
> - computation is fast (faster for GPU)
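The quoted O(n) claim can be made concrete: a dot product does n multiply-adds over 2n loads, so its arithmetic intensity is roughly one flop per element read and memory bandwidth, not compute, sets the ceiling. A minimal sketch (the explicit loop is for counting, not speed):

```python
import numpy as np

def dot(a, b):
    # n multiply-adds over 2n element loads: both compute and memory
    # traffic are O(n), so throughput is bandwidth-bound on any device.
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

a = np.arange(4.0)   # [0., 1., 2., 3.]
b = np.ones(4)
print(dot(a, b))     # same value as np.dot(a, b)
```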

You do realize that the throughput from onboard (video) RAM is going  
to be much higher, right? It's not just the parallelization but the  
memory bandwidth. And as James pointed out, if you can keep most of  
your intermediate computation on-card, you stand to benefit immensely,  
even if some individual operations see no tangible benefit from the  
GPU (i.e. the benefit comes in aggregate, from avoiding copies).
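A toy sketch of that aggregate effect, using a hypothetical `FakeDevice` class (not real GPU code) that just counts host/device copies: round-tripping per operation doubles the transfers compared with uploading once, chaining the work on-card, and downloading once.

```python
import numpy as np

class FakeDevice:
    """Hypothetical stand-in for a GPU: counts host<->device copies."""
    def __init__(self):
        self.transfers = 0
    def upload(self, a):
        self.transfers += 1
        return a.copy()
    def download(self, a):
        self.transfers += 1
        return a.copy()

dev = FakeDevice()
a = np.arange(3.0)

# Naive: a round trip for every operation -> 4 transfers for two ops.
x = dev.download(dev.upload(a) * 2)
x = dev.download(dev.upload(x) + 1)
naive = dev.transfers

dev.transfers = 0
# Keeping intermediates "on-card": one upload, chained ops, one download.
d = dev.upload(a)
d = d * 2
d = d + 1
y = dev.download(d)
chained = dev.transfers

print(naive, chained)           # 4 vs 2 transfers, identical results
```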

FWIW I agree with you that NumPy isn't the place for GPU stuff to  
happen. In the short to medium term we need a way to make it simpler  
for naturally expressed computations not to go hog wild with temporary  
allocations (it's a very hard problem given the constraints of the  
language). In the long term I envision something with flexible enough  
machinery to be manipulating objects in GPU memory with the same ease  
as in main memory, but I think the path to that lies in increasing the  
generality and flexibility of the interfaces exposed.

David
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion