Just my 2 cents: I consider the split between set_args() and the enqueue call a horrible C-ism, something that should not have entered the design of PyOpenCL, IMHO. If anything, the fact that we're having this thread demonstrates that it is not immediately obvious how that code works.
In the Khronos documentation for COPY_HOST_PTR, it explicitly says: "OpenCL implementations are allowed to cache the buffer contents pointed to by host_ptr in device memory. This cached copy can be used when kernels are executed on a device." Emphasis on "are allowed" and "can".

So which instruction performs the copy of a COPY_HOST_PTR buffer: the buffer allocation, the set_args, or the enqueue? It's left to the discretion of the implementer, with catastrophic results for the portability of your code. I would rather not have used COPY_HOST_PTR, and instead used an explicit buffer copy at the beginning, followed by Pythonic function-style kernel invocations.

Cheers
Guido

On Oct 07, Blair Azzopardi modulated:
> Hi Karl
>
> I'm not really having a problem running code as you are. From Andreas's
> answer to my original post, I am under the belief that my buffer
> objects are being resent each time I run a kernel with parameter
> arguments.

In my own experience, running kernels with device-array inputs and outputs performs as if the data remains on the OpenCL device. It is only when I call to_device() or map_to_host() that I see delays consistent with copying array data back and forth between host and device memory.

Regards,
Karl

_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
