No problem. It seems I didn't explain myself clearly enough--my problem comes up when using slices of GPUArray objects in C++ code outside of PyCUDA. After some looking around, I discovered the PointerHolderBase class, which solves this particular problem: I just have to ensure that the .gpudata member of a GPUArray is always either a DeviceAllocation object or something derived from PointerHolderBase. Then my C++ code, which calls extract&lt;CUdeviceptr&gt; on it, is happy. So I'm satisfied, and my earlier patch is unnecessary.
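For the archives, here's a minimal sketch of the kind of wrapper I mean. The class name, constructor arguments, and fake pointer value are made up for illustration; get_pointer() is the one method a PointerHolderBase subclass needs to supply, and the try/except fallback is just so the sketch can be read and run without pycuda installed:

```python
try:
    from pycuda.driver import PointerHolderBase
except ImportError:
    # Stand-in base class so this sketch runs without pycuda installed.
    class PointerHolderBase:
        pass

class ExternalMemory(PointerHolderBase):
    """Hypothetical wrapper for a device pointer owned by external C++ code."""

    def __init__(self, dev_ptr, owner=None):
        super().__init__()
        self.dev_ptr = dev_ptr
        self.owner = owner  # reference that keeps the external allocation alive

    def get_pointer(self):
        # PyCUDA calls this whenever it needs the raw device pointer,
        # e.g. when an instance is stored in a GPUArray's .gpudata.
        return self.dev_ptr
```

Storing an instance of such a class in .gpudata should then let the extract&lt;CUdeviceptr&gt; on the C++ side succeed.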
Thanks,
bryan

On Aug 2, 2010, at 6:21 PM, Andreas Kloeckner wrote:

> Hey Bryan,
>
> first of all, sorry for the late reply. I just returned from a crazy
> sequence of trips and am now working my way through the backlog...
>
> On Wed, 21 Jul 2010 20:54:14 -0700, Bryan Catanzaro
> <catan...@eecs.berkeley.edu> wrote:
>> Using numpy.intp works for kernels launched by PyCUDA. But it doesn't
>> work for PyCUDA's memcpy, which complains that numpy.intp (which seems
>> to be aliased to numpy.int32 on my machine) doesn't match the C++
>> signature:
>>
>> Boost.Python.ArgumentError: Python argument types in
>>     pycuda._driver.memcpy_dtod(numpy.int32, DeviceAllocation, int)
>> did not match C++ signature:
>>     memcpy_dtod(unsigned int dest, unsigned int src, unsigned int size)
>
> Ok, I'm beginning to understand what's at work here--Boost.Python simply
> doesn't like numpy array scalars as integer arguments. My PyUblas module
> fixes that to some extent, but that's really not viable here. Kernel
> arguments go through the buffer interface, so they are an entirely
> different affair.
>
> At first, I was leaning towards believing that's a bug, but the more I
> thought about it, the less I could actually believe it. If you use the
> following guidelines, I don't think you should run into trouble:
>
> - When storing GPU pointers in your code, use either DeviceAllocation
>   objects or bare Python int (or long) objects. These two should be
>   interchangeable in all things PyCUDA where a device pointer is
>   requested. If you need arithmetic to work, add int() casts, but be
>   aware that killing the DeviceAllocation makes your memory go away. Do
>   not use numpy scalars to store pointers.
>
> - IF you are using the unprepared kernel invocation syntax (and only
>   then), you need to convert pointers to numpy.intp to pass them as
>   arguments.
>
> I believe that if you use these guidelines, you should be ok.
> Anything I'm overlooking?
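A small host-only sketch of the two guidelines above (the variable names and the fake pointer value are mine, and the commented-out kernel call is hypothetical; the checks only exercise the int()/numpy.intp conversions, not a real GPU):

```python
import struct

import numpy as np

# Guideline 1: store device pointers as plain Python ints (here a made-up
# value stands in for int(some_device_allocation)).
dev_ptr = 0x200000
offset_ptr = dev_ptr + 16  # pointer arithmetic works on plain ints

# Guideline 2: only for the unprepared invocation syntax, wrap the pointer
# in numpy.intp so it can travel through the buffer interface, e.g.:
#   kernel(np.intp(offset_ptr), block=(256, 1, 1))   # hypothetical call
arg = np.intp(offset_ptr)

# numpy.intp is defined to be pointer-sized on the host, which is why it
# is the right scalar type for passing addresses as kernel arguments.
assert np.dtype(np.intp).itemsize == struct.calcsize("P")
assert int(arg) == offset_ptr  # round-trips back to the plain int
```

On a 32-bit host numpy.intp is the same size as numpy.int32, which is why it showed up that way in the traceback above.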
>
>> And it doesn't work for my own C++ functions (which call CUDA
>> functions), which I'm interoperating with PyCUDA. They also expect to
>> be able to extract a pointer out of gpudata, and when they get a
>> numpy.int32, they die.
>
> See above--is anything requiring that you use numpy scalars?
>
>> The patch I attached yesterday goes a step in that direction, but
>> ultimately what I really want is a C++ implementation of GPUArray. =)
>
> With all the code generation going on in GPUArray, I actually highly
> doubt you want that, but ok. :)
>
> Andreas

_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda