On Mon, 21 Mar 2011 19:55:31 +0100, Magnus Paulsson <paulsso...@gmail.com> 
wrote:
> > Wild theory: Maybe the print statements introduce GPU synchronization?
> > Does your observation change with multiple loops through the code?
> >
> > Also note that the profiler won't help you debug overlap. If it is
> > active, all GPU activity is synchronous.
> >
> > Andreas
> 
> No. None of the above. The "Working.py" code runs with overlap under
> the profiler, print statements included.

CUDA 4.0 programming guide, 3.2.5.1:

"When an application is run via a CUDA debugger or profiler (cuda-gdb, CUDA
Visual Profiler, Parallel Nsight), all launches are synchronous."

(and that sentence has been around for a few versions)

Either you are wrong or that sentence is. :)
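
One way to check for overlap without the profiler is to time the work with
CUDA events: the copy alone, the kernel alone, and both enqueued on separate
streams. What follows is only a rough sketch. The helper names
overlap_fraction and time_ms are made up for illustration and are not part
of PyCUDA, and the timing helper assumes an initialized context (for example
via pycuda.autoinit) and ordinary blocking streams, so that events recorded
on the default stream bracket work in the other streams.

```python
def overlap_fraction(t_copy_ms, t_kernel_ms, t_both_ms):
    """Estimate how much of the shorter operation was hidden by overlap.

    Returns 0.0 when the two operations ran back to back and 1.0 when
    the shorter one was hidden entirely behind the longer one.
    (Hypothetical helper, not a PyCUDA function.)
    """
    shorter = min(t_copy_ms, t_kernel_ms)
    if shorter <= 0:
        raise ValueError("timings must be positive")
    # Time saved compared with running the two operations serially.
    hidden = t_copy_ms + t_kernel_ms - t_both_ms
    # Clamp, since measurement noise can push the ratio out of [0, 1].
    return max(0.0, min(1.0, hidden / shorter))


def time_ms(enqueue):
    """Time whatever enqueue() submits to the GPU, in milliseconds.

    Assumption: a CUDA context already exists and enqueue() uses
    blocking streams, so the events on the default stream wait for it.
    (Hypothetical helper, not a PyCUDA function.)
    """
    import pycuda.driver as drv  # imported lazily; needs a GPU to run

    start, end = drv.Event(), drv.Event()
    start.record()
    enqueue()  # e.g. an async copy in one stream, a kernel in another
    end.record()
    end.synchronize()
    return start.time_till(end)
```

Run outside the profiler, a result near 0 from overlap_fraction would
suggest the copy and kernel were serialized, and a result near 1 that they
overlapped.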

Andreas


_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda
