Freddie Witherden <[email protected]> writes:
> Hi Andreas,
>
> On 19/01/14 03:06, Andreas Kloeckner wrote:
>> ALLOC_HOST_PTR with enqueue_map_buffer will give you page-locked memory
>> (and thus fast transfers) on Nvidia and AMD. You very likely do *not*
>> want to pass this buffer to any kernels though--just use it as a
>> transfer target.
>
> Looking at the CL documentation for clEnqueueMapBuffer:
>
>   The behavior of OpenCL function calls that enqueue commands that
>   write or copy to regions of a memory object that are mapped is
>   undefined.
>
> which is problematic as to create a persistent MPI request one needs a
> pointer which does not change. Hence, one would need to keep the buffer
> mapped at all times. However, doing so prevents one from copying
> to/from the buffer. (At least as far as I can discern.)
OIC. That said, I suspect the performance gain from page-locked
transfers is likely larger than the gain from a persistent MPI request,
but I might of course be wrong.

Andreas
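
For reference, here is a minimal sketch of the "transfer target" pattern
discussed above, under a few assumptions: the helper names
make_pinned_staging and demo are made up for illustration, the imports
are done lazily so the snippet loads even without an OpenCL runtime, and
whether the driver really takes a page-locked fast path is
vendor-dependent. The ALLOC_HOST_PTR buffer is mapped once and never
passed to a kernel or used as a copy target itself; only its mapped host
view is used as the host side of transfers to a separate device buffer.

```python
def make_pinned_staging(ctx, queue, shape, dtype="float64"):
    """Hypothetical helper: allocate a host-side buffer and map it once.

    Returns (pinned_buf, host_view). host_view is a NumPy array backed by
    the mapped memory; pinned_buf must stay alive as long as it is used.
    """
    import numpy as np
    import pyopencl as cl

    dtype = np.dtype(dtype)
    nbytes = int(np.prod(shape)) * dtype.itemsize
    mf = cl.mem_flags

    # ALLOC_HOST_PTR asks the implementation to allocate host-accessible
    # memory; on Nvidia and AMD this is typically page-locked.
    pinned_buf = cl.Buffer(ctx, mf.READ_WRITE | mf.ALLOC_HOST_PTR, size=nbytes)

    # Blocking map (the default); returns the array and an event.
    host_view, _event = cl.enqueue_map_buffer(
        queue, pinned_buf, cl.map_flags.READ | cl.map_flags.WRITE,
        0, shape, dtype)
    return pinned_buf, host_view


def demo(n=1024):
    """Round-trip data through a separate device buffer via the staging view."""
    import numpy as np
    import pyopencl as cl

    ctx = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(ctx)

    pinned_buf, host_view = make_pinned_staging(ctx, queue, (n,))
    # The buffer kernels would actually operate on is a different object.
    dev_buf = cl.Buffer(ctx, cl.mem_flags.READ_WRITE, size=host_view.nbytes)

    host_view[:] = np.arange(n)
    cl.enqueue_copy(queue, dev_buf, host_view)   # host -> device
    host_view[:] = 0
    cl.enqueue_copy(queue, host_view, dev_buf)   # device -> host
    queue.finish()
    return bool((host_view == np.arange(n)).all())
```

Calling demo() on a machine with a working OpenCL platform exercises
both transfer directions. Since the pinned buffer stays mapped and is
never itself written or copied to by an enqueued command, the address of
host_view does not change between transfers, which is the property a
persistent MPI request would need.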
_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
