Hi, I was hoping to get some insight on my observations. I am using PyOpenCL version 2 with an NVIDIA Tesla M2090 to run my kernel, which runs the SHA1 algorithm over variably sized data blocks. I'm trying to find the execution time of this kernel, but I get different readings when I use PyOpenCL's profiling tool and when I use the standard Python time library. My code is structured as:

    hash_start = time.time()
    hash_event = prog.sha1(queue, shape, None, in_buf, out_buf, ..<other buffers>)
    hash_event.wait()
    hash_end = time.time()
    add_hash_CPU_time(hash_end - hash_start)
    add_hash_GPU_time(1e-9 * (hash_event.profile.end - hash_event.profile.start))

These are the results for a test case of size 3 GB. The kernel gets called 64 times and runs 12288 threads each time.

Total OpenCL profiling time = 1.56s
Total CPU wall clock time = 13.79s

I need some help understanding the cause of this inconsistency. Or am I making a mistake in how I record the data?

Regards,
Abhilash Dighe
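
P.S. One way the gap might be narrowed down is to look at all four OpenCL profiling timestamps (queued, submit, start, end) for each launch instead of only start and end. Below is a minimal sketch of that bookkeeping. The helper name `breakdown` and the sample timestamps are made up for illustration and are not measurements; with a real launch, `hash_event.profile` would be passed in place of the stand-in object.

```python
from types import SimpleNamespace


def breakdown(profile, wall_seconds):
    """Split one launch's wall-clock time into stages, in seconds.

    `profile` is any object exposing the four OpenCL profiling
    timestamps (queued, submit, start, end) in nanoseconds, such as a
    PyOpenCL event.profile. `wall_seconds` is the time.time()
    difference measured around the enqueue and wait() calls.
    """
    ns = 1e-9
    queue_delay = ns * (profile.submit - profile.queued)   # waiting in the command queue
    launch_delay = ns * (profile.start - profile.submit)   # submitted, not yet running
    execution = ns * (profile.end - profile.start)         # what the post records as GPU time
    # Whatever the event timestamps do not cover (host-side call overhead, for example).
    other = wall_seconds - (queue_delay + launch_delay + execution)
    return {
        "queue_delay": queue_delay,
        "launch_delay": launch_delay,
        "execution": execution,
        "other": other,
    }


# Hypothetical timestamps (nanoseconds) standing in for hash_event.profile.
fake_profile = SimpleNamespace(queued=0, submit=1_000_000,
                               start=3_000_000, end=27_000_000)
result = breakdown(fake_profile, wall_seconds=0.2)
for stage, seconds in result.items():
    print(f"{stage:>12}: {seconds:.6f} s")

# Per-call averages from the totals reported above (64 kernel calls).
calls = 64
print("avg profiled time per call:", 1.56 / calls, "s")
print("avg wall-clock time per call:", 13.79 / calls, "s")
```

If most of the difference lands in "other", the time is being spent outside what the event's timestamps cover, and the profiling figure alone would not be expected to match the wall-clock figure.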
_______________________________________________
PyOpenCL mailing list
http://lists.tiker.net/listinfo/pyopencl
