Hi David,
First of all, I noticed I put a wrong title on my mail: the additional
memory transfer is from device to host.
The buffer d_c_buf is read back in the file matrix-multiply.py in the
examples packaged with PyOpenCL. As you will see, its size is half
the size of d_a_buf and d_b_buf.
The two memcpyHtoDasync calls of 512 MB are the transfers of the matrices A
and B (filling d_a_buf and d_b_buf), and there should be one
memcpyDtoHasync call of 256 MB (reading back d_c_buf to get the matrix C).
It seems the three buffers d_a_buf, d_b_buf and d_c_buf are read back a
second time.
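For a concrete sanity check, the expected sizes work out like this. (The matrix dimensions below are hypothetical, chosen only so that C comes out half the size of A and B; the actual example may use different ones.)

```python
import numpy as np

# Hypothetical dimensions: A is (m, k), B is (k, n), so C = A @ B is (m, n).
m, k, n = 1024, 2048, 1024

itemsize = np.float32().nbytes          # 4 bytes per element
bytes_a = m * k * itemsize              # host -> device (d_a_buf)
bytes_b = k * n * itemsize              # host -> device (d_b_buf)
bytes_c = m * n * itemsize              # device -> host (the only read-back)

# With these dimensions, C is half the size of A and B,
# so exactly one DtoH transfer of bytes_c is expected.
print(bytes_a, bytes_b, bytes_c)
```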
Regards,
Nicolas
David Garcia wrote:
Hi Nicolas,
What are the parameters that you pass to the enqueue_read_buffer call? In
particular, how many megabytes are you reading back?
I notice that the memcpyHtoDasync calls are 512 MB whereas the two
memcpyDtoHasync calls are 256 MB each.
Cheers,
David
On Tue, Feb 2, 2010 at 11:08 AM, Bonnel <[email protected]> wrote:
Hi,
I was just playing with the profiler from NVIDIA and I'm wondering
why all data from the graphics card are read back. I thought memory
was read back only when using cl.enqueue_read_buffer. Here is the
result I get from profiling matrix-multiply.py:
method              memory transfer size
memcpyHtoDasync     5.12e+06
memcpyHtoDasync     5.12e+06
memcpyDtoHasync     2.56e+06
memcpyDtoHasync     5.12e+06
memcpyDtoHasync     2.56e+06
memcpyDtoHasync     5.12e+06
As there is only one cl.enqueue_read_buffer call, there should be
only one memcpyDtoHasync call.
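Tallying the profiler rows above makes the discrepancy explicit (plain Python, with the sizes copied verbatim from the profiler output):

```python
# Profiler rows quoted above (method, transfer size as reported).
rows = [
    ("memcpyHtoDasync", 5.12e+06),
    ("memcpyHtoDasync", 5.12e+06),
    ("memcpyDtoHasync", 2.56e+06),
    ("memcpyDtoHasync", 5.12e+06),
    ("memcpyDtoHasync", 2.56e+06),
    ("memcpyDtoHasync", 5.12e+06),
]

# Only the device-to-host transfers should correspond to enqueue_read_buffer.
dtoh = [size for method, size in rows if method == "memcpyDtoHasync"]

print(len(dtoh))   # 4 device-to-host transfers observed, 1 expected
print(sum(dtoh))   # 15360000.0 read back in total, vs 2.56e+06 expected
```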
Regards,
Nicolas Bonnel
_______________________________________________
PyOpenCL mailing list
[email protected]
<mailto:[email protected]>
http://host304.hostmonster.com/mailman/listinfo/pyopencl_tiker.net