On Wed, Apr 04, 2012 at 04:35:21PM +0300, Tomi Pieviläinen wrote:
> Hi all,
> 
> after profiling my script, pstats shows that a significant amount of
> time is spent in a function that only calls
> pycuda.autoinit.context.synchronize() and a few CUDA kernel calls
> depending on a setting (different calls in the branches of an if).
> 
> Is it really possible that most of the time is spent processing that
> if, or are the sync or GPU function calls somehow skipped in cProfile?
> The relevant line from pstats is
> 
> 239528 1065.698    0.004 1089.941    0.005 translocation.py:308(forces)

Well, I ran it through line_profiler, and it seems that about 99% of
the time is spent in the synchronization calls. Odd that the
__syncthreads() calls within the kernel code don't cause much delay
(I'm running only one block, so they should be equivalent, right?).
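
One likely explanation for the profile: CUDA kernel launches are
asynchronous, so the launch call returns to Python almost immediately
and the host only waits when it synchronizes. A host-side profiler then
charges the kernel run time to the synchronize call. The sketch below is
only an analogy using the standard library, not PyCUDA code: fake_kernel
and launch_and_sync are made-up names, and a worker thread stands in for
the GPU.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_kernel(duration):
    """Hypothetical stand-in for GPU work: sleeps instead of computing."""
    time.sleep(duration)
    return duration


def launch_and_sync(executor, duration):
    """Time the asynchronous 'launch' and the blocking wait separately."""
    t0 = time.perf_counter()
    # Returns right away, much like an asynchronous kernel launch.
    future = executor.submit(fake_kernel, duration)
    t1 = time.perf_counter()
    # Blocks until the queued work has finished, which is the role
    # context.synchronize() plays on the host side.
    future.result()
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1


with ThreadPoolExecutor(max_workers=1) as executor:
    launch_time, sync_time = launch_and_sync(executor, 0.2)

print("launch: %.4f s, sync: %.4f s" % (launch_time, sync_time))
```

Nearly all of the elapsed time shows up in the wait, although the work
was queued by the launch. For timing the real kernels, CUDA events
(pycuda.driver.Event with record() and time_till()) measure on the
device side and are probably a better fit than a host profiler.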

-- 
Tomi Pieviläinen, +358 400 487 504
A: Because it disrupts the natural way of thinking.
Q: Why is top posting frowned upon?


_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda
