The problem seems to be different from the one explained in the link.
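This looks like the classic asynchronous-launch timing pitfall: CUDA kernel launches return immediately, so `fill_normal_float` can come back before the generation has finished, and the cost only surfaces at the next synchronization point (here, at context teardown). If that is the cause, calling `pycuda.driver.Context.synchronize()` (or `data.get()`) before stopping the timer should move the "pause" into Bench 1. The effect can be illustrated without a GPU; a minimal sketch, where the thread-pool worker is only a stand-in for an asynchronous kernel, not a PyCUDA call:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_work():
    # Stand-in for an asynchronous kernel: the real cost is paid here.
    time.sleep(0.5)
    return 42

with ThreadPoolExecutor(max_workers=1) as pool:
    t0 = time.time()
    future = pool.submit(slow_work)   # "launch": returns almost immediately
    launch_time = time.time() - t0

    result = future.result()          # "synchronize": waits for completion
    total_time = time.time() - t0

print("launch: %.3f sec" % launch_time)   # tiny, like Bench 1
print("total:  %.3f sec" % total_time)    # >= 0.5 sec, like Bench 2
```

Timing only the launch makes the work look nearly free; timing through the synchronization point charges the cost where it belongs.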

On Tue, Jan 18, 2011 at 9:46 AM, Martin Laprise <[email protected]>wrote:

> Hi, I just made some experiments with the CURAND wrappers. It seems to work
> very nicely except for a little detail that I can't figure out. The
> initialization of the generator and the actual random number generation seem
> very fast. But for whatever reason, PyCUDA takes a long time to "recover"
> after the number generation. This pause is significantly longer than the
> actual computation, and the delay increases with N. Here is an example:
>
>
> import numpy as np
> import pycuda.autoinit
> import pycuda.gpuarray
> from pycuda.curandom import PseudoRandomNumberGenerator, QuasiRandomNumberGenerator
> import cProfile
> import time as clock
>
>
> def curand_prof():
>
>     N = 100000000
>
>     t1 = clock.time()
>     # GPU
>     rr = PseudoRandomNumberGenerator(0, np.random.random(128).astype(np.int32))
>     data = pycuda.gpuarray.zeros([N], np.float32)
>     rr.fill_normal_float(data.gpudata, N)
>     t2 = clock.time()
>     print "Bench 1: " + str(t2-t1) + " sec"
>
>
> if __name__ == "__main__":
>     t1 = clock.time()
>     curand_prof()
>     t2 = clock.time()
>     print "Bench 2: " + str(t2-t1) + " sec"
>
>
> Here is the actual output with a GTX 260 gpu:
> Bench 1: 0.0117599964142 sec
> Bench 2: 4.40562295914 sec
>
> In the example, the pause has no consequence, but if I want to use the
> random matrix in another kernel ... it's quite a delay. I've done some
> research, and my guess is that the problem is linked to this previously
> reported problem:
>
> http://forums.nvidia.com/index.php?showtopic=185740
>
> Does anyone know how we can implement the solution in the wrapper?
>
> Martin
>
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda