Re: [PyOpenCL] Execute multiple kernels in sequence

CRV§ADER//KY Tue, 06 Oct 2015 23:41:20 -0700

Just my 2 cents:
I consider the split between set_args() and enqueue() a horrible C-ism -
something that should not have entered the design of PyOpenCL IMHO.
If anything, the fact that we're having this thread demonstrates that it is
not immediately obvious how that code works.


In the Khronos documentation for COPY_HOST_PTR, it explicitly says:
"OpenCL implementations are allowed to cache the buffer contents pointed to
by host_ptr in device memory. This cached copy can be used when kernels are
executed on a device."

Emphasis on "are allowed" and "can". So which instruction performs the copy
of a COPY_HOST_PTR buffer - the buffer allocation, the set_args, or the
enqueue?!? It's left at the discretion of the implementer - with
catastrophic results for the portability of your code.

I would rather have avoided COPY_HOST_PTR, done an explicit buffer copy at
the beginning, and then used pythonic function-style Kernel invocations.
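
A minimal sketch of that approach, assuming PyOpenCL and NumPy are installed
and an OpenCL device is available. The kernel names scale and add_offset, the
helper run_pipeline and its parameters are hypothetical, made up for
illustration, and not taken from the code discussed in this thread:

```python
import numpy as np

# Two trivial kernels that operate in place on the same device buffer.
# Kernel names are hypothetical, chosen only for this sketch.
KERNEL_SRC = """
__kernel void scale(__global float *data, const float factor)
{
    int i = get_global_id(0);
    data[i] *= factor;
}

__kernel void add_offset(__global float *data, const float delta)
{
    int i = get_global_id(0);
    data[i] += delta;
}
"""


def run_pipeline(host_in, factor=2.0, delta=1.0):
    """Upload once, run two kernels in sequence, download once."""
    # Imported here so the module still loads on machines without OpenCL.
    import pyopencl as cl

    host_in = np.ascontiguousarray(host_in, dtype=np.float32)
    ctx = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(ctx)
    prg = cl.Program(ctx, KERNEL_SRC).build()

    # Allocate by size only: no COPY_HOST_PTR, so nothing is transferred here.
    buf = cl.Buffer(ctx, cl.mem_flags.READ_WRITE, size=host_in.nbytes)

    # The single explicit host-to-device copy.
    cl.enqueue_copy(queue, buf, host_in)

    # Function-style invocations: arguments are passed at call time,
    # with no separate set_args() step. The queue is in-order by default,
    # so add_offset runs after scale has finished.
    global_size = (host_in.size,)
    prg.scale(queue, global_size, None, buf, np.float32(factor))
    prg.add_offset(queue, global_size, None, buf, np.float32(delta))

    # The single explicit device-to-host copy (blocking by default).
    host_out = np.empty_like(host_in)
    cl.enqueue_copy(queue, host_out, buf)
    return host_out
```

Calling run_pipeline(np.arange(8)) should give 2*x + 1 for each element. The
point is that the only host/device transfers are the two enqueue_copy calls,
so it is visible in the code where the data moves.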

Cheers
Guido

On Oct 07, Blair Azzopardi modulated:
> Hi Karl
>
> I'm not really having a problem running code as you are. From Andreas's
> answer to my original post, I am under the impression that my buffer
> objects are being resent each time I run a kernel with parameter
> arguments.


In my own experience, running kernels with device-array inputs and
outputs performs as if the data remains on the OpenCL device.  It is
only if I do to_device() or map_to_host() that I see delays consistent
with copying array data back and forth between host and device memory.
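
A rough illustration of that pattern, assuming PyOpenCL with its
pyopencl.array module and an available OpenCL device. The function name
chained_on_device is hypothetical and only serves this sketch:

```python
import numpy as np


def chained_on_device(host_in):
    """Chain two device-array operations with one upload and one download."""
    # Imported here so the module still loads on machines without OpenCL.
    import pyopencl as cl
    import pyopencl.array as cl_array

    ctx = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(ctx)

    # Host -> device: the only upload.
    a_dev = cl_array.to_device(queue, np.asarray(host_in, dtype=np.float32))

    # Each operation runs a kernel; inputs and results are device arrays,
    # so no explicit host transfer appears between the two steps.
    b_dev = a_dev * 2
    c_dev = b_dev + 1

    # Device -> host: the only download.
    return c_dev.get()
```

Here the intermediate b_dev is never brought back to the host; only
to_device() and get() move array data between host and device memory.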

Regards,

Karl


_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
