Hi David,

First of all, I noticed I put a wrong title on my mail: the additional
memory transfer is from device to host.

The buffer d_c_buf is read back in the file matrix-multiply.py in the
examples packaged with pyopencl. As you will see, its size is half the
size of d_a_buf and d_b_buf.
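For reference, here is a small numpy sketch of the size relationship. The shapes below are hypothetical, chosen only so that the byte counts match what the profiler reports (5.12e+06 bytes for A and B, 2.56e+06 bytes for C); the actual shapes in matrix-multiply.py may differ.

```python
import numpy as np

n = 800
# A (n x 2n) and B (2n x n) in float32: 800 * 1600 * 4 = 5,120,000 bytes each,
# matching the two 5.12e+06-byte memcpyHtoDasync calls.
a = np.random.rand(n, 2 * n).astype(np.float32)
b = np.random.rand(2 * n, n).astype(np.float32)

# C = A.B is n x n: 800 * 800 * 4 = 2,560,000 bytes, matching the single
# expected 2.56e+06-byte memcpyDtoHasync (the read-back of d_c_buf).
c = np.dot(a, b)

print(a.nbytes, b.nbytes, c.nbytes)
```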

The two memcpyHtoDasync calls of 5.12 MB are the transfers of the matrices
A and B (filling d_a_buf and d_b_buf), and there should be a single
memcpyDtoHasync call of 2.56 MB (reading back d_c_buf to get the matrix C).

It seems the three buffers d_a_buf, d_b_buf, and d_c_buf are read back a
second time.

Regards,
Nicolas


David Garcia wrote:
Hi Nicolas,

What parameters do you pass to the enqueue_read_buffer call? In particular, how many megabytes are you reading back?

I notice that the memcpyHtoDasync calls are 5.12 MB each, whereas two of the memcpyDtoHasync calls are 2.56 MB each.

Cheers,

David

On Tue, Feb 2, 2010 at 11:08 AM, Bonnel <[email protected] <mailto:[email protected]>> wrote:

    Hi,

    I was just playing with the NVIDIA profiler and I'm wondering
    why all the data on the graphics card is read back. I thought memory
    was read back only when using cl.enqueue_read_buffer. Here is the
    result I get from profiling matrix-multiply.py:

    method             memory transfer size
    memcpyHtoDasync    5.12e+06
    memcpyHtoDasync    5.12e+06
    memcpyDtoHasync    2.56e+06
    memcpyDtoHasync    5.12e+06
    memcpyDtoHasync    2.56e+06
    memcpyDtoHasync    5.12e+06

    As there is only one cl.enqueue_read_buffer call, there should be
    only one memcpyDtoHasync call.

    Regards,
    Nicolas Bonnel

    _______________________________________________
    PyOpenCL mailing list
    [email protected]
    <mailto:[email protected]>
    http://host304.hostmonster.com/mailman/listinfo/pyopencl_tiker.net

