Forwarding off-list reply.

-------- Original Message --------
Subject: Re: [PyCUDA] Contexts and Threading
Date: Sat, 29 Sep 2012 17:57:43 +0200
From: Eelco Hoogendoorn <[email protected]>
To: Freddie Witherden <[email protected]>
That is an interesting thought; I have been thinking about a similar design, where the aim is to cleanly execute algorithms that are embarrassingly parallel, in the sense of not requiring any inter-stream communication. Indeed, it would seem to me that Python threads should be a good fit for this type of application. One process per device and one thread per stream seems a natural match, given the implementation of those concepts in Python.

I don't know what kind of issues you'd run into, though; best to just start trying and see. For what it's worth, I imagine a design pattern with an abstract subclass of Thread which creates and holds a CUDA stream. You could then implement your algorithm in a subclass thereof, and that would be fairly clean.

What bothers me, though, is that the overloaded gpuarray operators do not support stream arguments. I can't really think of an elegant way to solve that, and I suppose there are lots of problems of that nature if you start digging deeper.

-----Original Message-----
From: Freddie Witherden
Sent: Friday, September 28, 2012 7:26 PM
To: [email protected]
Subject: [PyCUDA] Contexts and Threading

Hello,

I have a question regarding how PyCUDA interacts with CUDA 4.x's support for sharing contexts across threads. Broadly speaking, I wish to create an analogue of CUDA streams that also supports invoking arbitrary Python functions (as opposed to just CUDA kernels and memcpy operations).
My idea is to associate a Python thread with each CUDA stream in my application and use a Queue (import Queue) to submit either CUDA kernels or Python functions to the queue, with the core code being along the lines of:

    def queue_worker(q, comm, stream):
        while True:
            item = q.get()
            if item_is_a_cuda_kernel:
                item(stream=stream)
                stream.synchronize()
            elif item_is_a_mpireq:
                comm.Prequest.startall(item)
                comm.Prequest.waitall(item)
            else:
                item()
            q.task_done()

Allowing one to do:

    q1, q2 = Queue(), Queue()
    t1 = Thread(target=queue_worker, args=(q1, comm, a_stream1))
    t2 = Thread(target=queue_worker, args=(q2, comm, a_stream2))
    t1.start()
    t2.start()
    # Stick items into the queue for the thread to consume

However, this is only meaningful if it is possible to share a PyCUDA context between threads. Can someone update me on whether this is possible at all (at the CUDA driver level) and whether PyCUDA supports it?

Regards, Freddie.

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
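Freddie's queue_worker snippet above is pseudocode (the item_is_a_cuda_kernel tests are placeholders). A self-contained sketch of the same dispatch pattern, runnable without CUDA, might look as follows; FakeStream, the ('kernel', fn) tagging, and the None shutdown sentinel are illustrative assumptions, not PyCUDA or MPI API:

```python
# Runnable sketch of the queue-per-stream worker pattern, with the CUDA
# parts stubbed out.  A real version would pass a pycuda.driver.Stream
# and launch kernels on it instead of calling plain Python functions.
import queue
import threading

class FakeStream:
    """Stands in for a CUDA stream in this CUDA-free sketch."""
    def synchronize(self):
        pass  # a real stream would block until its queued work completes

def queue_worker(q, stream, results):
    while True:
        item = q.get()
        if item is None:          # sentinel: shut the worker down
            q.task_done()
            break
        kind, fn = item
        results.append(fn())      # run the work item
        if kind == 'kernel':      # stand-in for a CUDA kernel launch
            stream.synchronize()
        q.task_done()

results = []
q1 = queue.Queue()
t1 = threading.Thread(target=queue_worker,
                      args=(q1, FakeStream(), results))
t1.start()

# Stick items into the queue for the thread to consume
q1.put(('kernel', lambda: 'kernel-result'))
q1.put(('python', lambda: 21 * 2))
q1.put(None)
q1.join()
t1.join()
print(results)  # -> ['kernel-result', 42]
```

The sentinel-and-join shutdown is one common way to terminate such workers cleanly; Freddie's infinite loop would otherwise keep the thread alive forever.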
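The pattern Eelco suggests, an abstract Thread subclass that creates and holds a stream, with the algorithm supplied by a concrete subclass, could be sketched as below. StreamThread, run_algo, and FakeStream are invented names for illustration; none of this is PyCUDA API:

```python
# Sketch of an abstract Thread subclass that owns one (here faked) CUDA
# stream.  Concrete subclasses implement run_algo() with their algorithm.
import threading

class FakeStream:
    """Stands in for a CUDA stream in this CUDA-free sketch."""
    def synchronize(self):
        pass

class StreamThread(threading.Thread):
    """Abstract base: owns one stream per thread."""
    def __init__(self):
        super().__init__()
        # In real code this would be a pycuda stream, created after the
        # shared context has been made current in this thread.
        self.stream = FakeStream()
        self.result = None

    def run(self):
        self.result = self.run_algo()
        self.stream.synchronize()

    def run_algo(self):
        raise NotImplementedError

class DoubleThread(StreamThread):
    def run_algo(self):
        return 2 * 21

t = DoubleThread()
t.start()
t.join()
print(t.result)  # -> 42
```

This keeps the stream lifetime tied to the thread that uses it, which matches the one-thread-per-stream design discussed in the thread.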
