Hi, Andreas --- I'm coming a little late to the discussion, but here is the specific use case we have. Maybe more context will make the correct fix clearer.
In a project of ours, we wanted to mix PyCUDA kernel code with the CULA port of LAPACK to CUDA. To do this, we built ctypes wrappers for the CULA functions and then wrote some routines to let them interface cleanly with PyCUDA using GPUArray objects. What we have now, using the hack Garrett (a.k.a. gerald) described in his other email, works reasonably well, with some small caveats about memory allocation and context handling that aren't totally clear to me. Until we're sure that things are being done the correct way, we don't want to release anything.

So, two questions:

(1) The reason we were making dummy allocation objects is that, even though runtime and kernel code are now compatible in principle, using PyCUDA's allocator doesn't seem to work for us. Is there a more correct way of doing things that would let us avoid the dummy allocation objects that wrap CULA's device malloc?

(2) Another thing we tried that didn't work was passing an allocation function to the GPUArray constructor, but I guess it's supposed to return a real DeviceAllocation. Is that right?

On Sat, Jun 19, 2010 at 11:15 PM, Andreas Kloeckner <li...@informa.tiker.net> wrote:
> I thought the issue revolved around memory allocated by someone else that
> needed its own non-device_allocation RAII holder.

--
Louis Theran
Research Assistant Professor
Math Department, Temple University
http://math.temple.edu/~theran/
+1.215.204.3974
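P.S. For concreteness, the "dummy allocation object" hack is roughly the following sketch. This is not our actual code: the CULA binding names (e.g. a ctypes-bound `libcula.culaDeviceFree`) are placeholders, and the only real assumption is that GPUArray just needs something it can coerce to an integer device address, the way it does with a `pycuda.driver.DeviceAllocation`.

```python
class CULAAllocation:
    """RAII-style holder for a device pointer allocated outside PyCUDA.

    Mimics the one part of pycuda.driver.DeviceAllocation that GPUArray
    relies on: converting to the raw device address via int().
    `free_fn` stands in for the external deallocator (in our case, a
    hypothetical ctypes binding to CULA's device free routine).
    """

    def __init__(self, ptr, free_fn):
        self._ptr = ptr        # integer device address from the external malloc
        self._free_fn = free_fn

    def __int__(self):
        # GPUArray kernels only need the raw address as an integer.
        return self._ptr

    def free(self):
        # Idempotent: release the pointer exactly once.
        if self._ptr is not None:
            self._free_fn(self._ptr)
            self._ptr = None

    def __del__(self):
        self.free()


# Usage sketch (names are assumptions, not tested code):
#   raw_ptr = culaDeviceMalloc(nbytes)        # ctypes wrapper around CULA
#   alloc = CULAAllocation(raw_ptr, libcula.culaDeviceFree)
#   ary = gpuarray.GPUArray(shape, dtype, gpudata=alloc)
```

The question, restated against this sketch: is passing such an object as `gpudata` the intended interface, or is there a supported way to plug an external allocator in directly?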
_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda