Hi,

I was hoping to get some insight into my observations. I am using PyOpenCL
version 2 with an NVIDIA Tesla M2090 to run a kernel that computes SHA-1
over variably sized data blocks. I'm trying to measure the execution time of
the kernel, but I get different readings depending on whether I use
PyOpenCL's profiling tool or the standard Python time library. My code is
structured as:


hash_start = time.time()
hash_event = prog.sha1(queue, shape, None, in_buf, out_buf, ..<other buffers>)
hash_event.wait()
hash_end = time.time()
add_hash_CPU_time(hash_end - hash_start)
add_hash_GPU_time(1e-9 * (hash_event.profile.end - hash_event.profile.start))
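
In case it helps narrow things down: besides `start` and `end`, the event's profiling info also exposes `queued` and `submit` counters, so each call could be split into phases. This is only a minimal sketch, assuming the queue was created with profiling enabled. `profile_breakdown` is a hypothetical helper (not part of PyOpenCL), and the `SimpleNamespace` with made-up values only stands in for `hash_event.profile` so the snippet runs without a device.

```python
from types import SimpleNamespace


def profile_breakdown(profile):
    """Split an event's profiling counters (nanoseconds) into phases, in seconds.

    `profile` is expected to expose `queued`, `submit`, `start` and `end`,
    as `event.profile` does in PyOpenCL. Hypothetical helper for illustration.
    """
    ns = 1e-9
    return {
        "queued_to_submit": ns * (profile.submit - profile.queued),
        "submit_to_start": ns * (profile.start - profile.submit),
        "start_to_end": ns * (profile.end - profile.start),
    }


# Stand-in counters (made-up values) in place of hash_event.profile.
fake_profile = SimpleNamespace(
    queued=0, submit=1_000_000, start=3_000_000, end=27_000_000
)
breakdown = profile_breakdown(fake_profile)
for phase, seconds in breakdown.items():
    print(f"{phase}: {seconds:.6f} s")
```

In the real code this would be called as `profile_breakdown(hash_event.profile)` after `hash_event.wait()`. Only the `start_to_end` phase corresponds to what I currently pass to `add_hash_GPU_time`.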

These are the results for a test case of size 3 GB. The kernel gets called
64 times and runs 12288 threads each time.

Total OpenCL profiling time = 1.56s
Total CPU wall clock time = 13.79s
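
To put the two totals on a per-call basis, here is a quick calculation that uses only the figures reported above (nothing here is newly measured, and the variable names are just for illustration):

```python
calls = 64            # number of kernel invocations in the test case
gpu_total_s = 1.56    # total OpenCL profiling time reported above
wall_total_s = 13.79  # total CPU wall clock time reported above

# Time seen by time.time() but not covered by profile.start..profile.end.
gap_s = wall_total_s - gpu_total_s
gpu_per_call_s = gpu_total_s / calls
wall_per_call_s = wall_total_s / calls
ratio = wall_total_s / gpu_total_s

print(f"unaccounted time: {gap_s:.2f} s")
print(f"per call: GPU {gpu_per_call_s * 1e3:.1f} ms, wall {wall_per_call_s * 1e3:.1f} ms")
print(f"wall / GPU ratio: {ratio:.1f}x")
```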

I need some help understanding the cause of this discrepancy. Or am I
making a mistake in how I record the data?

Regards,
Abhilash Dighe
_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
