I modified add_dot() to use cublasxt.cublasXtSgemm. I don't think I need
to modify dot(), because it calls add_dot() at the end; it doesn't call
cublasxt.cublasXtSgemm directly unless my matrix is 1-D (which it isn't).
Correct?

BTW, smaller matrices work fine; it's only the larger ones that fail.
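
In case it helps, here is a minimal sketch of what calling cuBLAS-XT on host
arrays directly might look like. It assumes skcuda's cublasxt wrapper mirrors
the C API (cublasXtCreate / cublasXtDeviceSelect / cublasXtSgemm) and accepts
the same 'n'/'t' transpose characters, plain floats for alpha/beta, and raw
pointer integers that skcuda.cublas does; the shapes are deliberately small
stand-ins, and, per Lev's point below, the matrices are host-side numpy
arrays, not gpuarrays:

import numpy as np
from skcuda import cublasxt

m, k = 1024, 3                          # hypothetical sizes, just for illustration
a = np.ones((m, k), dtype=np.float32)   # host arrays; cuBLAS-XT tiles them across the GPUs itself
b = np.ones((m, k), dtype=np.float32)
c = np.zeros((m, m), dtype=np.float32)  # will hold a.dot(b.T)

handle = cublasxt.cublasXtCreate()
cublasxt.cublasXtDeviceSelect(handle, 2, np.array([0, 1], dtype=np.int32))  # use both cards

# numpy is row-major while cuBLAS is column-major, so a.dot(b.T) is computed
# as the transposed product with the operands swapped:
cublasxt.cublasXtSgemm(handle, 't', 'n', m, m, k,
                       1.0,
                       b.ctypes.data, k,
                       a.ctypes.data, k,
                       0.0,
                       c.ctypes.data, m)

cublasxt.cublasXtDestroy(handle)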


On Mon, Nov 23, 2015 at 11:35 AM, Lev Givon <l...@columbia.edu> wrote:
> Received from Keith Brown on Mon, Nov 23, 2015 at 11:10:45AM EST:
>> I have 2 small matrices of shape (160080, 3) and type float32, and I am
>> calculating their dot product. While doing this, I keep getting
>> pycuda._driver.MemoryError: cuMemAlloc failed: out of memory.
>>
>> I have 2 cards, each with 3 GB of memory. Each matrix takes about 1875
>> kilobytes. I am not sure why this is occurring.
>>
>> x=np.ones((160080,3L)).astype(np.float32)
>> a_gpu=gpuarray.to_gpu(x)
>> b_gpu=gpuarray.to_gpu(x)
>> c_gpu = linalg.dot(a_gpu,b_gpu,'N','T',handle=handle)
>>
>> My handle is a cublasXt handle (not a regular cublas handle, since cublasXt
>> apparently does better memory handling).
>>
>> Any idea what is going on?
>
> Did you also modify skcuda.linalg.dot() to explicitly call the cublasXt*gemm
> functions rather than the stock cublas*gemm functions? The cublasXt*gemm
> functions expect host memory pointers as their arguments, not GPU memory
> pointers.
> --
> Lev Givon
> Bionet Group | Neurokernel Project
> http://lebedov.github.io/
> http://neurokernel.github.io/
>
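
For comparison, the stock path that linalg.dot()/add_dot() normally takes is
roughly the following (a sketch under the same assumptions as above, not the
actual skcuda source): the gpuarrays' device pointers go straight into
cublasSgemm, which is why just handing that path a cublasXt handle isn't
enough; the cublasXt*gemm calls want the host-side arrays instead:

import numpy as np
import pycuda.autoinit
from pycuda import gpuarray
from skcuda import cublas

m, k = 1024, 3                              # hypothetical sizes again
x = np.ones((m, k), dtype=np.float32)
a_gpu = gpuarray.to_gpu(x)
b_gpu = gpuarray.to_gpu(x)
c_gpu = gpuarray.empty((m, m), np.float32)  # output lives entirely on one GPU

h = cublas.cublasCreate()
# same row-major/column-major operand swap as above, but with device pointers:
cublas.cublasSgemm(h, 't', 'n', m, m, k,
                   1.0,
                   b_gpu.gpudata, k,
                   a_gpu.gpudata, k,
                   0.0,
                   c_gpu.gpudata, m)
cublas.cublasDestroy(h)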

_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda
