Re: [PyCUDA] PyCuda 3x slower than nvcc

Jonathan WRIGHT Wed, 04 Apr 2012 06:03:10 -0700

Hello,

If you compile with the keep=True option you should find the ptx file generated by the compiler, e.g.:


In [127]: mod = SourceModule(s, keep=True )
*** compiler output in c:\users\wright\appdata\local\temp\tmpvzledt

Over in that folder I find "kernel.ptx", which contains the details of the nvcc compiler and options used, and the assembler output. If you compile your C based kernel using nvcc and the -ptx option you should be able to diff the two outputs.
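A minimal sketch of that comparison, in case a plain diff is too noisy. It assumes the compiler details sit in "//" comment lines of the ptx file, so it diffs those separately from the instructions. The function names (split_ptx, compare_ptx) and the two file paths passed on the command line are made up for illustration; only the standard library is used.

```python
import difflib
import sys


def split_ptx(text):
    """Split PTX source into (comment_lines, instruction_lines).

    Assumption: compiler and option details appear in '//' comment lines.
    Blank lines are dropped and surrounding whitespace is stripped.
    """
    comments, instructions = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("//"):
            comments.append(stripped)
        else:
            instructions.append(stripped)
    return comments, instructions


def compare_ptx(pycuda_ptx, nvcc_ptx):
    """Return unified diffs for the comment lines and the instruction lines."""
    c1, i1 = split_ptx(pycuda_ptx)
    c2, i2 = split_ptx(nvcc_ptx)
    return {
        "comments": list(difflib.unified_diff(
            c1, c2, "pycuda (comments)", "nvcc (comments)", lineterm="")),
        "instructions": list(difflib.unified_diff(
            i1, i2, "pycuda (instructions)", "nvcc (instructions)", lineterm="")),
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python compare_ptx.py <pycuda kernel.ptx> <nvcc kernel.ptx>
    with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
        result = compare_ptx(f1.read(), f2.read())
    for section, lines in result.items():
        print("==", section, "==")
        print("\n".join(lines) if lines else "(identical)")
```

An empty "instructions" diff with a non-empty "comments" diff would suggest the generated code is the same and only the reported compiler details differ.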

If the ptx files match and the timing still does not, then you might want to try configuring pycuda with --cuda-trace as another way to track down the differences.


Cheers

Jon

On 04/04/2012 10:39, Michiel Bruinink wrote:

Hello,
I have written a Cuda program that calculates lots of Gauss fits. When I
use that same program with PyCuda, the time it takes to do the
calculations is almost 3x the time it takes with nvcc.
With nvcc it takes 380 ms and with PyCuda it takes 1110 ms, while the
outcome of the calculations is the same.
There is no difference in the device code, because I use the same file
for the device code in both cases.
How is this possible?
Does anybody have an idea?
I am not sure, but could it have something to do with array declarations
inside a device function?
# define lenP 6
# define nPoints 100000
...
__device__ void someFunction()
{
    float residu[nPoints], newResidu[nPoints], pNew[lenP], b[lenP],
          deltaP[lenP];
    float A[lenP*lenP], Jacobian[nPoints*lenP], B[lenP*lenP];
    ...
}
Thanks,
Michiel.


_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda


