Hi Christian,

On Thu, 30 Sep 2010 20:11:23 -0400, Christian Fobel <christ...@fobel.net> wrote:
> I am trying to combine the use of PyCuda and straight C CUDA code
> (through Boost.Python bindings).  The reason is that I would like to
> have the convenience of allocating memory and copying data to/from the
> device using the PyCuda bindings, but require very accurate timings of
> the code executing on the GPU.  I tried using PyCuda Event objects,
> but the function call overhead is adding too much time.

I find this hard to believe. CUDA launch overhead is a few microseconds,
which is far longer than the Python-C-Python trip time I'm used to from
Boost.Python, so I'd be surprised if the time to enter and return from
the cuEventRecord() wrapper showed up in your measurements. If you
weren't using prepared kernel invocations [1], PyCUDA launch overhead
may have been an issue. What exactly did you observe?

[1]
http://documen.tician.de/pycuda/driver.html#pycuda.driver.Function.prepared_call
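
For reference, here's a rough sketch of what a prepared invocation plus
event-based timing can look like. The kernel and all names in it are made
up purely for illustration, and note that the exact prepare()/
prepared_call() signatures have shifted between PyCUDA versions (newer
releases take the block size in prepared_call rather than prepare), so
check the docs for the version you're running:

import numpy as np
import pycuda.autoinit
import pycuda.driver as drv
from pycuda.compiler import SourceModule

# toy kernel, purely for illustration
mod = SourceModule("""
__global__ void scale(float *a, float factor)
{
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    a[i] *= factor;
}
""")
scale = mod.get_function("scale")

a = np.random.randn(512).astype(np.float32)
a_gpu = drv.mem_alloc(a.nbytes)
drv.memcpy_htod(a_gpu, a)

# set up the argument format and block size once, ahead of time
scale.prepare("Pf", block=(128, 1, 1))

start, stop = drv.Event(), drv.Event()
start.record()
# pass the device pointer as an integer for the "P" slot
scale.prepared_call((4, 1), int(a_gpu), np.float32(2.0))
stop.record()
stop.synchronize()
print("kernel time: %g ms" % start.time_till(stop))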

> Traceback (most recent call last):
> LaunchError: cuCtxPopCurrent failed: launch failed
> [snip]
> 
> I'm thinking it likely has to do with the C code not being aware of
> the Context being initialized by PyCuda.  Is there any way to
> reference the Context created by PyCuda?

If you're using pycuda.autoinit, it's simply
pycuda.autoinit.context. [2] Otherwise, you can always *not* use
autoinit and create a context manually.

[2] http://documen.tician.de/pycuda/util.html#module-pycuda.autoinit
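
If it helps, the manual route is only a few lines. This is just a sketch,
and device 0 is an arbitrary choice:

import pycuda.driver as drv

drv.init()                          # initialize the driver API
ctx = drv.Device(0).make_context()  # create a context and make it current

try:
    # allocations, kernel launches, and calls into your Boost.Python
    # module all happen while this context is current on the thread
    pass
finally:
    ctx.pop()                       # detach the context when you're done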

> Perhaps I can pass this Context to my C code as well to ensure both
> sets of code are using the same CUDA context?

Well, in the CUDA run-time interface (cuda* instead of cu*), there is no
such thing as an explicit context object. Instead, the context is
implicit in the calling thread's state. CUDA 3.0's runtime interface will
automatically use an active driver-level context (e.g. one that PyCUDA
might've created) without too much fuss.

Also, in everything I've seen, launch failures were bugs in code, not
issues in PyCUDA--they're essentially GPU segfaults. If you're on Linux
and you see messages like this in 'dmesg', that's almost certainly
what's happening:

Sep 21 11:11:27 teramite kernel: NVRM: Xid (0001:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100

(It might also be that memory got freed behind your back.)
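
One common way that happens: PyCUDA frees device memory when the last
Python reference to it goes away, so if you only hold on to the raw
pointer while the C code runs, keep the owning object referenced.
Roughly:

import numpy as np
import pycuda.autoinit
import pycuda.gpuarray as gpuarray

data = np.arange(1024, dtype=np.float32)

# risky: nothing keeps the GPUArray alive, so its memory can be
# deallocated before the C code ever touches the pointer
# ptr = int(gpuarray.to_gpu(data).gpudata)

# safer: keep a reference for as long as the raw pointer is in use
a_gpu = gpuarray.to_gpu(data)
ptr = int(a_gpu.gpudata)
# ... hand ptr to the Boost.Python-wrapped code while a_gpu stays alive ...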

Bryan Catanzaro (cc'd) has significant experience gluing run-time and
driver code together and might be able to comment on whether he's seen
anything sketchy in doing so.

In any case, I hope this gets you further.

Andreas

