my experience with trying to cuda-ize svd/nmf calculations is that they're not really a good fit for cuda. specifically, most of the expensive operations are matrix multiplications over very long and narrow matrices (m x k or k x n), where m ~= n (within an order of magnitude) but k << m, n. even with m ~= 2^16 (the max for cublas matrices) and k < 2^8, i was barely breaking even with normal cpu-based blas libs.
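
a minimal sketch for getting a cpu baseline before porting anything, assuming numpy is available. it times the 120x100 svd from the question and one long-and-narrow product of the kind described above. the helper name time_call and the sizes m, n, k are illustrative only, not taken from any real benchmark, and timings will vary with the machine and the blas that numpy is linked against.

```python
import time

import numpy as np


def time_call(fn, repeats=20):
    """Return the best wall-clock time (seconds) over `repeats` calls of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


rng = np.random.default_rng(0)

# the 120x100 case from the question, using the thin (economy) svd
a = rng.standard_normal((120, 100))
u, s, vt = np.linalg.svd(a, full_matrices=False)
svd_seconds = time_call(lambda: np.linalg.svd(a, full_matrices=False))

# long, narrow factors: w is (m x k), h is (k x n), with k << m, n
# (sizes are assumptions, chosen small enough to run quickly)
m, n, k = 2048, 2048, 32
w = rng.random((m, k))
h = rng.random((k, n))
v = rng.random((m, n))

# the kind of products an nmf update needs: (k x m)(m x n) and (m x k)(k x n)
wtv = w.T @ v
wh = w @ h
wtv_seconds = time_call(lambda: w.T @ v, repeats=5)

print(f"svd 120x100:       {svd_seconds * 1e3:.3f} ms")
print(f"w.T @ v ({k}x{m} by {m}x{n}): {wtv_seconds * 1e3:.3f} ms")
```

a gpu version only pays off if it beats these numbers after including the host-to-device and device-to-host copies, so those should be timed as part of any comparison.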

derek


ananth ranga wrote:
Hello people,

         I am Ranga, a new member of the group. I need to compute the
SVD of a 120x100 matrix. On a CPU, the VTK implementation takes
about 5 ms per evaluation, so I was wondering whether a PyCUDA
version could give me a better result in terms of speed.

If anyone has a PyCUDA version of the SVD calculation, could you please help me out?

Thanks,
ranga

_______________________________________________
PyCUDA mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net
