I take yesterday's statement back: I am able to reproduce this behavior with a C program after all. The trick was to allocate/free the memory in the same order in C as the Python process does.

I've opened a thread about it in NVIDIA's developer zone here: https://devtalk.nvidia.com/default/topic/760060/cuda-programming-and-performance/cudaerrorillegaladdress-on-kepler-gpus-but-program-runs-fine-on-fermi-gpus/

Cheers

Thomas



On 2014-07-07 17:36, Thomas Unterthiner wrote:
I did some more digging on this error. First of all, I was not able to reproduce it with a C++ re-implementation of the code, so at this point I'm assuming it's a bug in PyCUDA.

By enabling CUDA traces in PyCUDA, I was able to nail the error down to a cuMemFree call that fails with code 700 (CUDA_ERROR_ILLEGAL_ADDRESS). Interestingly, the error goes away if I delete the memory manually, meaning the following code runs through without a hitch:


import numpy as np
import pycuda.autoinit
from pycuda import gpuarray
from scikits.cuda.cublas import cublasSgemm
import scikits.cuda.autoinit
from scikits.cuda.misc import _global_cublas_handle as handle

n, m, k = 131, 2483, 3
for i in range(5):
    print(i)
    s = slice(128, n)
    b = gpuarray.empty((m, k), dtype=np.float32)
    c = gpuarray.empty((m, m), dtype=np.float32)
    a = gpuarray.zeros((n, m), dtype=np.float32)
    ks = a[s].shape[0]
    cublasSgemm(handle, 'n', 'n', m, m, ks, np.float32(1.0),
                a[s].gpudata, m, b.gpudata, k, np.float32(0.0),
                c.gpudata, m)
    del c, a, b


However, if I comment out the `del` statement on the last line, the error re-appears. If I switch to using a DeviceMemoryPool allocator, the error will appear as soon as I call `DeviceMemoryPool.free_held()`.
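For reference, wiring in the pool would look roughly like this. This is only a sketch of the variant I mean, assuming PyCUDA's `pycuda.tools.DeviceMemoryPool` and the `allocator=` keyword of the gpuarray constructors; the cublasSgemm call is the same as in the snippet above.

```python
import numpy as np
import pycuda.autoinit
from pycuda import gpuarray
from pycuda.tools import DeviceMemoryPool

pool = DeviceMemoryPool()

n, m, k = 131, 2483, 3
# Same shapes as above, but allocated through the pool instead of
# directly via cuMemAlloc.
b = gpuarray.empty((m, k), dtype=np.float32, allocator=pool.allocate)
c = gpuarray.empty((m, m), dtype=np.float32, allocator=pool.allocate)
a = gpuarray.zeros((n, m), dtype=np.float32, allocator=pool.allocate)

# ... cublasSgemm call as in the example above ...

del c, a, b          # returns the blocks to the pool, no cuMemFree yet
pool.free_held()     # with the bug present, the illegal address surfaces here
```

With a pool, `del` only hands the blocks back to the pool; the actual cuMemFree calls are deferred until `free_held()`, which is consistent with the error only showing up at that point.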

Cheers

Thomas

_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda


