Barry Smith via petsc-dev <petsc-dev@mcs.anl.gov> writes:

>   The PetscLogGpuTimeBegin()/End was written by Hong so it works with events 
> to get a GPU timing; it is not supposed to include the CPU kernel launch times 
> or the time to move the scalar arguments to the GPU. It may not be perfect 
> but it is the best we can do to capture the time the GPU is actively doing 
> the numerics, which is what we want.

As we discussed at the time, the results can be collected asynchronously, 
which would reduce the negative impact of profiling on end-to-end 
performance.

But I think what's proposed here is okay because PetscLogGpuTimeBegin() starts 
counting when the device reaches that point, not when the call is issued on 
the host.
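
To illustrate the distinction, here is a toy model of a single in-order device 
queue. It is only a sketch: ToyStream, launch(), record_event() and demo() are 
hypothetical names made up for this example, not PETSc or CUDA calls, and the 
durations are invented. It assumes the device runs queued work strictly in 
order and that an event is stamped when the device reaches it.

```python
class ToyStream:
    """Toy in-order device queue; times are integer microseconds."""

    def __init__(self):
        self.host_clock = 0    # current time on the host
        self.device_free = 0   # time at which the device finishes queued work

    def launch(self, kernel_us, overhead_us):
        # The host pays the launch overhead and returns immediately.
        self.host_clock += overhead_us
        # The kernel starts once it is queued AND the device is free.
        start = max(self.device_free, self.host_clock)
        self.device_free = start + kernel_us

    def record_event(self):
        # The event is stamped when the device reaches it in the queue,
        # not when the host enqueues it.
        self.device_free = max(self.device_free, self.host_clock)
        return self.device_free

    def synchronize(self):
        # The host blocks until the device has drained the queue.
        self.host_clock = max(self.host_clock, self.device_free)


def demo():
    s = ToyStream()
    s.launch(kernel_us=5000, overhead_us=10)   # earlier work keeps the device busy
    host_begin = s.host_clock                  # host-side "begin" timestamp
    begin = s.record_event()                   # device-side "begin" event
    s.launch(kernel_us=2000, overhead_us=10)   # the kernel we want to time
    end = s.record_event()                     # device-side "end" event
    s.synchronize()
    device_elapsed = end - begin               # time between the two events
    host_elapsed = s.host_clock - host_begin   # host wall time, includes the backlog
    return device_elapsed, host_elapsed


if __name__ == "__main__":
    print(demo())
```

In this model the event-based interval covers only the 2000 us kernel, while 
the host-side interval also absorbs the 5000 us of earlier queued work.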
