Blair Azzopardi <[email protected]> writes:
> How would I execute multiple kernels sequentially without needing to resend
> data to the GPU?
>
> I'm thinking it would be something along the lines of:
>
> data1_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=..)
> data2_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=..)
>
> dataX_g = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR,
> hostbuf=local_var)
>
> my_kernel1.set_args(data1_g, data2_g, .., dataX_g)
> my_kernel2.set_args(data1_g, data2_g, .., dataX_g)
> my_kernel3.set_args(data1_g, data2_g, .., dataX_g)
>
> cl.enqueue_nd_range_kernel(queue, my_kernel1, global_ws, local_ws)
> cl.enqueue_nd_range_kernel(queue, my_kernel2, global_ws, local_ws)
> cl.enqueue_task(queue, my_kernel3)
>
> cl.enqueue_copy(queue, local_var, dataX_g)
>
> Does that look right?

Yep.

> Does one need to set_args for each kernel or is there
> a device specific method for setting global memory?

Neither.

> Perhaps there's an
> example somewhere?

https://github.com/pyopencl/pyopencl/tree/master/examples
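
As a rough illustration of the pattern discussed above, here is a minimal
sketch, not taken from the examples directory. The kernel names (add_inputs,
double_it), the helper names (run_on_device, reference) and the float32 data
layout are all made up for this sketch. It assumes a default in-order command
queue, so the second kernel sees the output of the first, and it calls each
kernel directly, which sets the arguments and enqueues it in one step. The
buffers are uploaded once and only the result is copied back.

```python
import numpy as np

# Two hypothetical kernels that work on the same device buffers.
KERNEL_SRC = """
__kernel void add_inputs(__global const float *a,
                         __global const float *b,
                         __global float *x)
{
    int i = get_global_id(0);
    x[i] = x[i] + a[i] + b[i];
}

__kernel void double_it(__global float *x)
{
    int i = get_global_id(0);
    x[i] = x[i] * 2.0f;
}
"""


def reference(a, b, x):
    """Host-side version of the two kernels, for checking the result."""
    return (x + a + b) * 2


def run_on_device(a, b, x):
    """Upload a, b, x once, run both kernels in order, copy x back."""
    # Imported here so the module still loads without an OpenCL runtime.
    import pyopencl as cl

    mf = cl.mem_flags
    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)  # in-order by default

    # Data is copied to the device once, when the buffers are created.
    a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
    b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
    x_g = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=x)

    prg = cl.Program(ctx, KERNEL_SRC).build()

    # Calling a kernel sets its arguments and enqueues it.
    # Passing None as the local size lets the implementation choose one.
    prg.add_inputs(queue, x.shape, None, a_g, b_g, x_g)
    prg.double_it(queue, x.shape, None, x_g)

    # Only the final result travels back to the host.
    result = np.empty_like(x)
    cl.enqueue_copy(queue, result, x_g)
    return result
```

Calling run_on_device(a, b, x) with three float32 arrays of equal length
should give the same values as reference(a, b, x), up to floating-point
rounding. The explicit set_args / enqueue_nd_range_kernel route from the
question works the same way on the same buffers.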

Andreas


