Forwarding off-list reply.
-------- Original Message --------
Subject: Re: [PyCUDA] Contexts and Threading
Date: Sat, 29 Sep 2012 19:08:09 +0200
From: Eelco Hoogendoorn <[email protected]>
To: Freddie Witherden <[email protected]>

Actually, it seems I should RTFM; see the PyCUDA FAQ.

Combining threads and streams does not seem to work at all (or I am doing something really stupid). It seems you need to init the context in the thread, and cannot share it between threads. At least for what I have in mind, creating a context per thread would not really make sense: a context has a huge overhead, and getting multiple contexts to play nicely on the same device at the same time has so far eluded me as well.

That is rather disappointing, as it seems there is no way around the hacky state-machine stream nonsense if you want to run a lot of small kernels in parallel (I am thinking millions of calls, each of which would be lucky to saturate a single SM).

Am I missing something?

-----Original Message-----
From: Freddie Witherden
Sent: Friday, September 28, 2012 7:26 PM
To: [email protected]
Subject: [PyCUDA] Contexts and Threading

Hello,

I have a question regarding how PyCUDA interacts with CUDA 4.x's support for sharing contexts across threads. Broadly speaking, I wish to create an analogue of CUDA streams that also supports invoking arbitrary Python functions (as opposed to just CUDA kernels and memcpy operations).
My idea is to associate a Python thread with each CUDA stream in my application and use a Queue (import Queue) to submit either CUDA kernels or Python functions to the queue, with the core code being along the lines of:

    def queue_worker(q, comm, stream):
        while True:
            item = q.get()
            if item_is_a_cuda_kernel:
                item(stream=stream)
                stream.synchronize()
            elif item_is_a_mpireq:
                comm.Prequest.startall(item)
                comm.Prequest.waitall(item)
            else:
                item()
            q.task_done()

allowing one to do:

    q1, q2 = Queue(), Queue()
    t1 = Thread(target=queue_worker, args=(q1, comm, a_stream1))
    t2 = Thread(target=queue_worker, args=(q2, comm, a_stream2))
    t1.start()
    t2.start()
    # Stick items into the queues for the threads to consume

However, this is only meaningful if it is possible to share a PyCUDA context between threads. Can someone tell me whether this is possible at all (at the CUDA driver level) and whether PyCUDA supports it?

Regards, Freddie.

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
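[Editor's note] The queue-worker design sketched in the original message can be made concrete as the following runnable Python 3 sketch. The CUDA and MPI pieces are stubbed out: `KernelItem` and the `"stream-1"` string are illustrative stand-ins, not PyCUDA API, and the per-thread context creation that the reply says is required (per the PyCUDA FAQ) appears only as a comment marking where `pycuda.driver.Device(0).make_context()` would go.

```python
import queue
import threading

results = []

class KernelItem:
    """Stand-in for a compiled CUDA kernel (illustrative, not PyCUDA API)."""
    def __init__(self, name):
        self.name = name

    def __call__(self, stream=None):
        # A real kernel would be launched on `stream` here, followed by
        # stream.synchronize().
        results.append((self.name, stream))

def queue_worker(q, stream):
    # Per the reply above, a real worker would create its own context here,
    # e.g. ctx = pycuda.driver.Device(0).make_context(), and ctx.pop() on
    # shutdown, since a context apparently cannot be shared across threads.
    while True:
        item = q.get()
        try:
            if item is None:            # sentinel: stop this worker
                break
            if isinstance(item, KernelItem):
                item(stream=stream)     # dispatch on this worker's stream
            else:
                item()                  # arbitrary Python callable
        finally:
            q.task_done()

q1 = queue.Queue()
t1 = threading.Thread(target=queue_worker, args=(q1, "stream-1"))
t1.start()

# Stick items into the queue for the thread to consume.
q1.put(KernelItem("kernel-a"))
q1.put(lambda: results.append(("py-func", None)))
q1.put(None)
q1.join()
t1.join()
print(results)  # [('kernel-a', 'stream-1'), ('py-func', None)]
```

The sentinel/`task_done` handling is one conventional way to drain and shut down such workers; the MPI branch from the original sketch is omitted since it would need a live `mpi4py` communicator.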
