Forwarding off-list reply.

-------- Original Message --------
Subject: Re: [PyCUDA] Contexts and Threading
Date: Sat, 29 Sep 2012 17:57:43 +0200
From: Eelco Hoogendoorn <[email protected]>
To: Freddie Witherden <[email protected]>


That is an interesting thought; I have been thinking about a similar
design, where the aim is to cleanly execute algorithms that are
embarrassingly parallel, in the sense of not requiring any inter-stream
communication.

Indeed, it would seem to me that Python threads should be a good fit for
this type of application. One process per device and one thread per stream
seems to be a natural match, given the implementation of those concepts in
Python.

Don't know what kind of issues you'd run into, though; best to just start
trying and see. For what it's worth, I imagine a design pattern with an
abstract subclass of Thread, which creates and holds a CUDA stream. You
could then implement your algorithm in a subclass thereof, and that would
be fairly clean.
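A minimal sketch of that pattern, assuming PyCUDA's pycuda.driver.Stream
(stubbed out below with a placeholder class so the sketch stands on its
own without a GPU; the class and method names are illustrative, not from
the original message):

```python
import threading

class FakeStream:
    """Placeholder for pycuda.driver.Stream; real code would create the
    stream from an active CUDA context instead."""
    def synchronize(self):
        pass

class StreamWorker(threading.Thread):
    """Abstract thread that owns one CUDA stream; subclasses implement
    run_algo()."""
    def __init__(self):
        super().__init__()
        # In real PyCUDA code: self.stream = pycuda.driver.Stream()
        self.stream = FakeStream()

    def run(self):
        # Run the subclass's algorithm on this thread's private stream,
        # then wait for all work queued on the stream to finish.
        self.run_algo(self.stream)
        self.stream.synchronize()

    def run_algo(self, stream):
        raise NotImplementedError

class MyAlgo(StreamWorker):
    def __init__(self):
        super().__init__()
        self.result = None

    def run_algo(self, stream):
        # Launch kernels on `stream` here; this stub just records a value.
        self.result = 42

w = MyAlgo()
w.start()
w.join()
print(w.result)  # 42
```

The point is simply that each thread owns exactly one stream, so kernels
launched from within run_algo() never need to coordinate with other
streams.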

What bothers me, though, is that the overloaded GPUArray operators do not
support stream arguments; I can't really think of an elegant way to solve
that, and I suppose there are lots of problems of that nature if you start
digging deeper.


-----Oorspronkelijk bericht-----
From: Freddie Witherden
Sent: Friday, September 28, 2012 7:26 PM
To: [email protected]
Subject: [PyCUDA] Contexts and Threading

Hello,

I have a question regarding how PyCUDA interacts with CUDA 4.x's
support for sharing contexts across threads.

Broadly speaking, I wish to create an analogue of CUDA streams that
also supports invoking arbitrary Python functions (as opposed to just
CUDA kernels and memcpy operations).

My idea is to associate a Python thread with each CUDA stream in my
application and use a Queue (import Queue) to submit either CUDA
kernels or Python functions to the queue with the core code being
along the lines of:

def queue_worker(q, comm, stream):
    while True:
        item = q.get()
        if item_is_a_cuda_kernel:
            # Launch the kernel on this worker's stream and wait for it
            item(stream=stream)
            stream.synchronize()
        elif item_is_a_mpireq:
            # Persistent MPI requests: start them and wait for completion
            comm.Prequest.startall(item)
            comm.Prequest.waitall(item)
        else:
            # Plain Python callable
            item()
        q.task_done()

Allowing one to do:
    q1, q2 = Queue(), Queue()
    t1 = Thread(target=queue_worker, args=(q1, comm, a_stream1))
    t2 = Thread(target=queue_worker, args=(q2, comm, a_stream2))
    t1.start()
    t2.start()

   # Stick items into the queue for the thread to consume
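Stripped of the CUDA and MPI parts, the worker pattern itself can be
exercised with plain callables; this is a self-contained sketch (the
sentinel-based shutdown is my addition, not part of the original
message):

```python
import queue
import threading

def queue_worker(q):
    # Consume work items (plain callables here) until a None sentinel
    # tells the worker to exit.
    while True:
        item = q.get()
        if item is None:
            q.task_done()
            break
        item()
        q.task_done()

results = []
q1 = queue.Queue()
t1 = threading.Thread(target=queue_worker, args=(q1,))
t1.start()

# Stick items into the queue for the thread to consume
for i in range(3):
    q1.put(lambda i=i: results.append(i * i))
q1.put(None)  # ask the worker to shut down

q1.join()
t1.join()
print(results)  # [0, 1, 4]
```

Since a single worker thread drains each queue in order, items submitted
to the same queue execute sequentially, which is exactly the in-order
guarantee a CUDA stream provides.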


However, this is only meaningful if it is possible to share a PyCUDA
context between threads.  Can someone update me on if this is possible
at all (on the CUDA driver level) and if PyCUDA supports this?

Regards, Freddie.

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda



