[PyCUDA] Freezes in cudaCtxSynchronize when multithreading with multiple GPUs

David Eklund Wed, 06 Jun 2012 12:19:23 -0700

We have a persistent problem when multithreading with PyCUDA. I have a
thread pool with one thread per GPU. Each thread initializes its own context
with its given device ID and waits to read jobs from a common Queue object.
The main thread processes requests and adds CUDA-related jobs to the Queue.
This works well enough and utilizes all available GPUs, but when issuing
lots of relatively fast CUDA calls we frequently run into a locking issue
where one computation hangs indefinitely. When the contexts are created
with the pycuda.driver.ctx_flags.SCHED_BLOCKING_SYNC flag and I attach to a
hung process, I find it waiting on a semaphore in cuCtxSynchronize in
libcuda.so. When the contexts are created without the SCHED_BLOCKING_SYNC
flag, I find it's still stuck in cuCtxSynchronize, but in a spin loop
waiting for results.
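
The setup described above can be sketched roughly as follows. This is a
minimal illustration, not the actual code: the names make_pycuda_context,
gpu_worker and start_pool, the use of None as a shutdown sentinel, and the
shape of a job (a callable taking the device ID) are all assumptions made
for the example. The context factory is passed in as a parameter so the
queue logic can be exercised without a GPU.

```python
import queue
import threading


def make_pycuda_context(device_id):
    """Create a PyCUDA context on one device (needs pycuda and a GPU)."""
    # Imported lazily so the rest of the sketch loads without pycuda.
    import pycuda.driver as cuda

    cuda.init()
    # The new context becomes current on the calling thread.
    return cuda.Device(device_id).make_context(
        flags=cuda.ctx_flags.SCHED_BLOCKING_SYNC
    )


def gpu_worker(device_id, jobs, results, make_context=make_pycuda_context):
    """Own one context for the thread's lifetime and drain the shared queue."""
    ctx = make_context(device_id)
    try:
        while True:
            job = jobs.get()
            if job is None:  # hypothetical sentinel: shut this worker down
                break
            try:
                results.put((device_id, job(device_id)))
            except Exception as exc:  # report the failure instead of dying
                results.put((device_id, exc))
    finally:
        # Pop the context on the same thread that created it.
        ctx.pop()


def start_pool(num_devices, make_context=make_pycuda_context):
    """Start one worker thread per device, all reading from a common Queue."""
    jobs, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(
            target=gpu_worker,
            args=(device_id, jobs, results, make_context),
            daemon=True,
        )
        for device_id in range(num_devices)
    ]
    for thread in threads:
        thread.start()
    return jobs, results, threads
```

The main thread would put callables on jobs and, at shutdown, put one None
per worker before joining the threads.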


I have an alternative version with all the same code that bypasses PyCUDA
and instead uses ctypes to call directly into an nvcc-compiled shared
library. That library uses cudaSetDevice and cudaDeviceSynchronize rather
than the cuCtx* functions, and it does not experience these locking issues.
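
For comparison, the runtime-API pattern can be sketched with ctypes as
below. This is only an illustration under stated assumptions: it calls
cudaSetDevice and cudaDeviceSynchronize on the CUDA runtime library
directly rather than through the nvcc-compiled library mentioned above, the
library name "libcudart.so" may differ by installation, and the helper
names load_cudart, check and run_on_device are hypothetical.

```python
import ctypes


def load_cudart(path="libcudart.so"):
    """Load the CUDA runtime; the library name is an assumption."""
    return ctypes.CDLL(path)


def check(status, call):
    # Runtime API calls return a cudaError_t; 0 is cudaSuccess.
    if status != 0:
        raise RuntimeError(f"{call} failed with CUDA error {status}")


def run_on_device(cudart, device_id, work):
    """Bind the calling thread to a device, run work(), then synchronize."""
    # cudaSetDevice selects the device for the calling host thread, so no
    # explicit context push/pop is needed here.
    check(cudart.cudaSetDevice(device_id), "cudaSetDevice")
    result = work()
    # Block until all previously issued work on the device has finished.
    check(cudart.cudaDeviceSynchronize(), "cudaDeviceSynchronize")
    return result
```

Each worker thread would call run_on_device with its own device ID, with
work() wrapping the calls into the compiled library.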

Has anyone run into this kind of issue before? Also, is there support in
PyCUDA (or planned support for future releases) for using the cudaDevice*
functions rather than explicit context management?

David

_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda
