Hi All - I needed to slice a GPUArray and then pass the gpudata of the resulting slice to a CUDA kernel expecting a pointer. The current slicing logic in pycuda.gpuarray.GPUArray calculates the gpudata of the slice as a plain integer, which causes problems when I try to pass it as a pointer to a CUDA kernel, due to the type mismatch between int and pointer.
I've solved this problem for myself by changing a couple of things:

1. Adding a constructor to device_allocation in cuda.hpp: device_allocation(CUdeviceptr devptr, bool valid). This allows me to create a device_allocation object that will not be freed upon destruction. Obviously, the pointer resulting from a slicing operation that constructs a view of a GPUArray should never be freed.

2. Changing the constructor exposed for DeviceAllocation in wrap_cudadrv.cpp from py::no_init to py::init<CUdeviceptr, bool>(), which allows me to instantiate a DeviceAllocation object from within Python. Andreas, I have the feeling you won't like exposing this, but it was a quick solution that worked for me.

3. Changing the way the new gpudata is calculated in pycuda.gpuarray.GPUArray.__getitem__() to create an "invalid" DeviceAllocation object that will not be freed upon destruction:

    gpudata = drv.DeviceAllocation(int(self.gpudata) + start*self.dtype.itemsize, False)

I've attached the patch, in case it's useful.

- bryan
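For anyone following along, the idea behind steps 1 and 3 can be sketched in plain Python, without touching the C++ layer. NonOwningAllocation and sliced_gpudata below are illustrative names, not PyCUDA API; the arithmetic mirrors the __getitem__ change above:

```python
class NonOwningAllocation:
    """Wraps a raw device address without owning it, mirroring the
    device_allocation(devptr, valid=False) idea: destruction must
    NOT call cuMemFree, since the parent array owns the memory."""

    def __init__(self, devptr):
        self.devptr = devptr

    def __int__(self):
        # Kernels ultimately receive the address as an integer.
        return self.devptr

    def __del__(self):
        pass  # intentionally no free: this is only a view


def sliced_gpudata(base_ptr, start, itemsize):
    # Address of element `start`, as computed in __getitem__:
    # int(self.gpudata) + start * self.dtype.itemsize
    return NonOwningAllocation(base_ptr + start * itemsize)
```

For example, slicing a float32 array (itemsize 4) at element 10 from base address 0x1000 yields address 0x1028, and deleting the wrapper leaves the underlying allocation untouched.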
0001-Allow-GPUArray-slices-to-be-used-in-CUDA-kernels-whi.patch
_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda