Forwarding off-list reply.
-------- Original Message --------
Subject: Re: [PyCUDA] Contexts and Threading
Date: Sat, 29 Sep 2012 19:08:09 +0200
From: Eelco Hoogendoorn <[email protected]>
To: Freddie Witherden <[email protected]>

Actually, it seems I should RTFM; see the PyCUDA FAQ.

Combining threads and streams does not seem to work at all (or I am doing something really stupid). It seems you need to init the context in the thread, and cannot share it between threads. At least for what I have in mind, creating a context per thread would not really make sense: a context has a huge overhead, and getting multiple contexts to play nicely on the same device at the same time has so far eluded me as well.

That is rather disappointing, as it seems there is no way around the hacky state-machine stream nonsense if you want to run a lot of small kernels in parallel (I am thinking millions of calls, each of which would be lucky to saturate a single SM).

Am I missing something?

-----Original Message-----
From: Freddie Witherden
Sent: Friday, September 28, 2012 7:26 PM
To: [email protected]
Subject: [PyCUDA] Contexts and Threading

Hello,

I have a question regarding how PyCUDA interacts with CUDA 4.x's support for sharing contexts across threads. Broadly speaking, I wish to create an analogue of CUDA streams that also supports invoking arbitrary Python functions (as opposed to just CUDA kernels and memcpy operations).
My idea is to associate a Python thread with each CUDA stream in my application and use a Queue (import Queue) to submit either CUDA kernels or Python functions to the queue, with the core code being along the lines of:

    def queue_worker(q, comm, stream):
        while True:
            item = q.get()
            if item_is_a_cuda_kernel:
                item(stream=stream)
                stream.synchronize()
            elif item_is_a_mpireq:
                comm.Prequest.startall(item)
                comm.Prequest.waitall(item)
            else:
                item()
            q.task_done()

allowing one to do:

    q1, q2 = Queue(), Queue()
    t1 = Thread(target=queue_worker, args=(q1, comm, a_stream1))
    t2 = Thread(target=queue_worker, args=(q2, comm, a_stream2))
    t1.start()
    t2.start()
    # Stick items into the queues for the threads to consume

However, this is only meaningful if it is possible to share a PyCUDA context between threads. Can someone tell me whether this is possible at all (at the CUDA driver level) and whether PyCUDA supports it?

Regards, Freddie.

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
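[Editor's note] The queue-worker design sketched in the original message can be made concrete as the following runnable Python 3 sketch. The CUDA and MPI pieces are stubbed out: `KernelItem` and the `"stream-1"` string are illustrative stand-ins, not PyCUDA API, and the per-thread context creation that the reply says is required (per the PyCUDA FAQ) appears only as a comment marking where `pycuda.driver.Device(0).make_context()` would go.

```python
import queue
import threading

results = []

class KernelItem:
    """Stand-in for a compiled CUDA kernel (illustrative, not PyCUDA API)."""
    def __init__(self, name):
        self.name = name

    def __call__(self, stream=None):
        # A real kernel would be launched on `stream` here, followed by
        # stream.synchronize().
        results.append((self.name, stream))

def queue_worker(q, stream):
    # Per the reply above, a real worker would create its own context here,
    # e.g. ctx = pycuda.driver.Device(0).make_context(), and ctx.pop() on
    # shutdown, since a context apparently cannot be shared across threads.
    while True:
        item = q.get()
        try:
            if item is None:            # sentinel: stop this worker
                break
            if isinstance(item, KernelItem):
                item(stream=stream)     # dispatch on this worker's stream
            else:
                item()                  # arbitrary Python callable
        finally:
            q.task_done()

q1 = queue.Queue()
t1 = threading.Thread(target=queue_worker, args=(q1, "stream-1"))
t1.start()

# Stick items into the queue for the thread to consume.
q1.put(KernelItem("kernel-a"))
q1.put(lambda: results.append(("py-func", None)))
q1.put(None)
q1.join()
t1.join()
print(results)  # [('kernel-a', 'stream-1'), ('py-func', None)]
```

The sentinel/`task_done` handling is one conventional way to drain and shut down such workers; the MPI branch from the original sketch is omitted since it would need a live `mpi4py` communicator.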
