Hi, Andreas --- I'm coming a little late to the discussion, but here is the
specific use case we have.  Maybe more context will make the correct fix
clearer.

In one of our projects, we wanted to mix PyCUDA kernel code with CULA, the
port of LAPACK to CUDA.  To do this, we built ctypes wrappers for the CULA
functions and then wrote some routines to let them interoperate cleanly with
PyCUDA via GPUArray objects.  (A sketch of the ctypes layer is below.)
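
To give a flavor of it, the ctypes layer is roughly this shape (a minimal
sketch; culaInitialize is from CULA's C API, but treat the library name and
other specifics as illustrative rather than exact):

    import ctypes
    import pycuda.autoinit  # creates the CUDA context

    # Load CULA and initialize it.  The shared-library name here is
    # illustrative; error checking is elided.
    _cula = ctypes.cdll.LoadLibrary("libcula.so")
    _cula.culaInitialize()

    def device_ptr(gpu_array):
        # A GPUArray's gpudata yields the raw device pointer via int(),
        # which is all a ctypes-wrapped CULA routine needs to see.
        return ctypes.c_void_p(int(gpu_array.gpudata))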

What we have now, using the hack Garrett (a.k.a. gerald) described in his
other email, works reasonably well, with some small caveats about memory
allocation and context handling that aren't totally clear to me.  Until
we're sure that things are being done the correct way, we don't want to
release anything.

So two questions:

(1) The reason we made dummy allocation objects is that, even though
runtime and kernel code are now compatible in principle, using PyCUDA's
allocator doesn't seem to work for us.  Is there a more correct way of doing
things that would let us avoid the dummy allocation objects that wrap CULA's
device malloc?  (The wrapper is sketched below.)
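
For concreteness, the dummy allocation object is essentially this (a minimal
sketch, with ctypes and _cula as in the first snippet; culaDeviceMalloc and
culaDeviceFree are stand-in names for the entry points we actually wrap, and
error checking is elided):

    class CULAAllocation(object):
        # Minimal stand-in for pycuda.driver.DeviceAllocation, wrapping
        # memory that comes from CULA's device malloc.

        def __init__(self, nbytes):
            ptr = ctypes.c_void_p()
            _cula.culaDeviceMalloc(ctypes.byref(ptr), nbytes)  # hypothetical binding
            self._ptr = ptr.value
            self._freed = False

        def __int__(self):
            # PyCUDA gets the raw device pointer out of an allocation
            # object by calling int() on it.
            return self._ptr

        def free(self):
            if not self._freed:
                _cula.culaDeviceFree(ctypes.c_void_p(self._ptr))  # hypothetical binding
                self._freed = True

        def __del__(self):
            self.free()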

(2) Another thing we tried that didn't work was passing an allocation
function to the GPUArray constructor, but I gather it's expected to return a
real DeviceAllocation.  Is that right?  (What we tried is sketched below.)
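
What we tried looks like this, using the CULAAllocation wrapper from above
(a sketch, assuming the allocator contract is just "callable taking a byte
count"):

    import numpy as np
    import pycuda.autoinit
    import pycuda.gpuarray as gpuarray

    def cula_allocator(nbytes):
        # Hands back our wrapper instead of a pycuda.driver.DeviceAllocation.
        return CULAAllocation(nbytes)

    # This is the call that didn't behave the way we expected:
    a_gpu = gpuarray.GPUArray((1024,), np.float32, allocator=cula_allocator)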

On Sat, Jun 19, 2010 at 11:15 PM, Andreas Kloeckner <li...@informa.tiker.net> wrote:

>  I thought the issue revolved around memory allocated by someone else that
> needed its own non-device_allocation RAII holder.
>


-- 
Louis Theran
Research Assistant Professor
Math Department, Temple University
http://math.temple.edu/~theran/
+1.215.204.3974