Freddie Witherden <[email protected]> writes:

> Hi Andreas,
>
> On 19/01/14 03:06, Andreas Kloeckner wrote:
>> ALLOC_HOST_PTR with enqueue_map_buffer will give you page-locked memory
>> (and thus fast transfers) on Nvidia and AMD. You very likely do *not*
>> want to pass this buffer to any kernels though--just use it as a
>> transfer target.
>
> Looking at the CL documentation for clEnqueueMapBuffer:
>
>   The behavior of OpenCL function calls that enqueue commands that
>   write or copy to regions of a memory object that are mapped is
>   undefined.
>
> which is problematic as to create a persistent MPI request one needs a
> pointer which does not change.  Hence, one would need to keep the buffer
> mapped at all times.  However, doing so prevents one from copying
> to/from the buffer.  (At least as far as I can discern.)

Oh, I see. That said, I'd suspect that the performance gain from the
page-locked transfer is larger than the gain from the persistent MPI
request, but I might of course be wrong.
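
For reference, here is a rough sketch of the transfer-target pattern I
mean. It is untested on your setup and rests on a few assumptions: the
function name pinned_roundtrip and the float64 payload are made up for
illustration, and whether the mapped memory really is page-locked is up
to the vendor runtime. The ALLOC_HOST_PTR buffer is mapped once and only
its host array is used, as the host side of copies to and from a
separate device buffer, so the mapped buffer itself is never named in a
write or copy command.

```python
def pinned_roundtrip(n=1024):
    """Copy n doubles host -> device -> host via one mapped staging array."""
    # Imported here so that merely loading this file needs neither package.
    import numpy as np
    import pyopencl as cl

    ctx = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(ctx)
    mf = cl.mem_flags
    nbytes = n * np.dtype(np.float64).itemsize

    # Buffer whose only job is to own the host allocation.
    # It is never passed to a kernel.
    pinned = cl.Buffer(ctx, mf.READ_WRITE | mf.ALLOC_HOST_PTR, size=nbytes)
    # Ordinary device buffer that kernels would actually operate on.
    dev = cl.Buffer(ctx, mf.READ_WRITE, size=nbytes)

    # Map once and keep the mapping; 'staging' is a NumPy view of the
    # mapped host memory.
    staging, _event = cl.enqueue_map_buffer(
        queue, pinned, cl.map_flags.READ | cl.map_flags.WRITE,
        0, (n,), np.float64, is_blocking=True)

    expected = np.arange(n, dtype=np.float64)
    staging[:] = expected
    cl.enqueue_copy(queue, dev, staging)   # host -> device (blocking)
    staging[:] = 0.0
    cl.enqueue_copy(queue, staging, dev)   # device -> host (blocking)
    queue.finish()

    # True if the data survived the round trip.
    return bool((staging == expected).all())
```

Calling pinned_roundtrip() on a machine with a working OpenCL platform
should return True. Comparing its timing against the same copies done
with a plain NumPy array would show whether the staging buffer buys
anything on a given device.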

Andreas

