On 19/03/13 23:54, ext Peter Colberg wrote:

> If I read the clFinish code correctly, the queue is synchronous, i.e.,
> items in the queue are processed one at a time, and the host thread
> blocks during processing.

Yes, this is indeed the case.

>
> Did you already think about asynchronous/non-blocking processing of
> the queue? This would be useful for computation and memory transfer
> overlap, and CPU+GPU or multi-GPU computation.

The current status is just due to lack of time and/or prioritization.
Asynchronous (and out-of-order) processing of the queue(s) is certainly
intended to be added ... at some point.

This could be done in several ways. One is to have the host code's main
thread add command_nodes (better name needed, perhaps :)) to the
queue, while a background thread consumes the queue synchronously, one
command at a time. One consumer thread is probably needed per device/queue.
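
To make the thread-based alternative concrete, here is a rough sketch
using POSIX threads. It is not pocl code: async_queue, queue_start,
queue_enqueue, queue_finish, queue_stop and add_one are made-up names,
and command_node here is a simplified stand-in for whatever the real
node structure holds. The host thread only appends nodes and returns;
one background thread per queue runs them in order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical, simplified command node (not pocl's real structure). */
typedef struct command_node {
    void (*run)(void *arg);          /* work to execute on the device */
    void *arg;
    struct command_node *next;
} command_node;

typedef struct {
    command_node *head, *tail;
    int pending;                     /* enqueued but not yet completed */
    int shutdown;                    /* set by queue_stop() */
    pthread_mutex_t lock;
    pthread_cond_t not_empty;        /* signalled when work arrives */
    pthread_cond_t drained;          /* signalled when pending hits 0 */
    pthread_t worker;
} async_queue;

/* Consumer thread: pops one node at a time and runs it synchronously. */
static void *consumer_main(void *p) {
    async_queue *q = p;
    for (;;) {
        pthread_mutex_lock(&q->lock);
        while (q->head == NULL && !q->shutdown)
            pthread_cond_wait(&q->not_empty, &q->lock);
        if (q->head == NULL) {       /* shutdown requested, queue empty */
            pthread_mutex_unlock(&q->lock);
            return NULL;
        }
        command_node *n = q->head;
        q->head = n->next;
        if (q->head == NULL)
            q->tail = NULL;
        pthread_mutex_unlock(&q->lock);

        n->run(n->arg);              /* executed without holding the lock */
        free(n);

        pthread_mutex_lock(&q->lock);
        assert(q->pending > 0);
        if (--q->pending == 0)
            pthread_cond_broadcast(&q->drained);
        pthread_mutex_unlock(&q->lock);
    }
}

int queue_start(async_queue *q) {
    q->head = q->tail = NULL;
    q->pending = 0;
    q->shutdown = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->drained, NULL);
    return pthread_create(&q->worker, NULL, consumer_main, q);
}

/* Host side: append a command and return immediately (non-blocking). */
int queue_enqueue(async_queue *q, void (*run)(void *), void *arg) {
    command_node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->run = run;
    n->arg = arg;
    n->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
    q->pending++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Rough analogue of clFinish: block until everything enqueued is done. */
void queue_finish(async_queue *q) {
    pthread_mutex_lock(&q->lock);
    while (q->pending > 0)
        pthread_cond_wait(&q->drained, &q->lock);
    pthread_mutex_unlock(&q->lock);
}

/* Let the consumer run any remaining commands, then join it. */
void queue_stop(async_queue *q) {
    pthread_mutex_lock(&q->lock);
    q->shutdown = 1;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
    pthread_join(q->worker, NULL);
    pthread_mutex_destroy(&q->lock);
    pthread_cond_destroy(&q->not_empty);
    pthread_cond_destroy(&q->drained);
}

/* Example command used for illustration only. */
void add_one(void *arg) { ++*(int *)arg; }
```

With this shape the host blocks only in queue_finish, so transfers on
one queue could overlap with computation on another. Out-of-order
execution would need dependency tracking on top of this, which the
sketch does not attempt.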

An alternative is to re-process the queue whenever the host calls any
OpenCL API function. The OpenCL device would then need a mechanism to
asynchronously notify the host of what has finished since the last
re-processing. The current implementation is a degenerate form of this
idea. I favour the thread-based alternative over this one.
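
For comparison, the re-process-on-API-call alternative might look
roughly like the sketch below. Again, none of this is pocl code: device,
poll_queue, queue_reprocess, api_enqueue, api_query and
device_signal_done are made-up names, and the "notification" is reduced
to an atomic counter that the device bumps whenever a command finishes.
The sketch assumes in-order execution with at most one command in
flight.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_CMDS 16

typedef enum { CMD_QUEUED, CMD_SUBMITTED, CMD_COMPLETE } cmd_state;

/* Hypothetical device: it only reports how many commands it finished. */
typedef struct {
    atomic_uint completed;           /* written asynchronously by device */
} device;

typedef struct {
    cmd_state state[MAX_CMDS];
    unsigned count;                  /* commands enqueued so far */
    unsigned submitted;              /* commands handed to the device */
    unsigned retired;                /* completions the host has observed */
    device *dev;
} poll_queue;

/* Called at the top of every API entry point: pick up completions that
 * happened since the last call, then submit the next command if the
 * device is idle. */
void queue_reprocess(poll_queue *q) {
    unsigned done = atomic_load_explicit(&q->dev->completed,
                                         memory_order_acquire);
    if (done > q->submitted)         /* ignore spurious notifications */
        done = q->submitted;
    while (q->retired < done)
        q->state[q->retired++] = CMD_COMPLETE;
    if (q->submitted == q->retired && q->submitted < q->count)
        q->state[q->submitted++] = CMD_SUBMITTED;
}

/* Hypothetical API entry point: enqueue a command without blocking. */
int api_enqueue(poll_queue *q) {
    if (q->count == MAX_CMDS)
        return -1;
    q->state[q->count++] = CMD_QUEUED;
    queue_reprocess(q);
    return 0;
}

/* Hypothetical API entry point: query the state of one command. */
cmd_state api_query(poll_queue *q, unsigned idx) {
    assert(idx < q->count);
    queue_reprocess(q);
    return q->state[idx];
}

/* Stand-in for the device's asynchronous "command finished" signal. */
void device_signal_done(device *d) {
    atomic_fetch_add_explicit(&d->completed, 1, memory_order_release);
}
```

The weakness shows up directly: nothing advances unless the host
happens to call into the API, so a finished command can leave the
device idle until the next call. The consumer thread avoids that.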


kalle

-- 
But beware the debugger. Dark side of the source it is.
If once you start down the dark path, forever will it dominate
your destiny. Consume you it will.
