Hi Christian,

On Thu, 30 Sep 2010 20:11:23 -0400, Christian Fobel <christ...@fobel.net> wrote:
> I am trying to combine the use of PyCuda and straight C CUDA code
> (through Boost.Python bindings). The reason is that I would like to
> have the convenience of allocating memory and copying data to/from the
> device using the PyCuda bindings, but require very accurate timings of
> the code executing on the GPU. I tried using PyCuda Event objects,
> but the function call overhead is adding too much time.
I find this hard to believe. CUDA launch overhead is a few microseconds, which is far longer than the Python-C-Python round trip I'm used to from Boost.Python, so I'd be surprised if the time to enter and return from the cuEventRecord() wrapper showed up in your measurements. If you weren't using prepared kernel invocations [1], PyCUDA launch overhead may have been an issue. What exactly did you observe?

[1] http://documen.tician.de/pycuda/driver.html#pycuda.driver.Function.prepared_call

> Traceback (most recent call last):
> LaunchError: cuCtxPopCurrent failed: launch failed
> [snip]
>
> I'm thinking it likely has to do with the C code not being aware of
> the Context initialized by PyCuda. Is there any way to reference the
> Context created by PyCuda?

If you're using pycuda.autoinit, it's simply pycuda.autoinit.context. [2] Otherwise, you can always *not* use autoinit and create a context manually.

[2] http://documen.tician.de/pycuda/util.html#module-pycuda.autoinit

> Perhaps I can pass this Context to my C code as well to ensure both
> sets of code are using the same CUDA context?

Well, in the CUDA run-time interface (cuda* instead of cu*), there is no such thing as an explicit context object. Instead, the context is implicit in thread state. CUDA 3.0's runtime interface will automatically use an active driver-level context (e.g. one that PyCUDA might've created) without too much fuss.

Also, in everything I've seen, launch failures were bugs in user code, not issues in PyCUDA--they're essentially GPU segfaults. If you're on Linux and you see messages like this in 'dmesg', that's almost certainly what's happening:

Sep 21 11:11:27 teramite kernel: NVRM: Xid (0001:00): 13, 0003 00000000 000050c0 00000368 00000000 00000100

(It might also be that memory got freed behind your back.)

Bryan Catanzaro (cc'd) has significant experience gluing run-time and driver code together and might be able to comment on whether he's seen anything sketchy in doing so.
In any case, I hope this gets you further.

Andreas
_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda