On Tue, 2011-01-18 at 09:46 -0500, Martin Laprise wrote:
> Hi, I just made some experiments with the CURAND wrappers. It seems to work
> very nicely except for a little detail that I can't figure out. The
> initialization of the generator and the actual random number generation seem
> very fast. But for whatever reason, PyCUDA takes a long time to "recover"
> after the number generation. This pause is significantly longer than the
> actual computation and the delay increases with N. Here is an example:
> 

curand kernels are called asynchronously.
This means that PyCUDA returns immediately after
initiating the call and does not wait for the result.
This allows the hardware or the driver to better manage
the order of execution, and to run many kernels concurrently
on modern hardware (compute capability 2.x).
So the "recovery" pause you see is most likely the kernel
still running after the call has already returned.
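
The same effect can be reproduced without a GPU. The sketch below
is only a CPU analogy and does not use PyCUDA: a worker thread
stands in for the kernel, submit() stands in for the asynchronous
launch, and result() stands in for Stream.synchronize(). The names
slow_fill and time_async_call are made up for this illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_fill(n, delay=0.2):
    """Hypothetical stand-in for a GPU kernel: sleeps, then returns n."""
    time.sleep(delay)
    return n


def time_async_call(n=1000, delay=0.2):
    """Return (seconds until the call returned, seconds until the result was ready)."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        start = time.perf_counter()
        # submit() hands the work off and returns at once, like a kernel launch.
        future = pool.submit(slow_fill, n, delay)
        launched = time.perf_counter() - start
        # result() blocks until the work is done, like synchronizing a stream.
        future.result()
        finished = time.perf_counter() - start
    return launched, finished


if __name__ == "__main__":
    launched, finished = time_async_call()
    print("after launch: %.4f s" % launched)
    print("after wait:   %.4f s" % finished)
```

A timer stopped right after the launch measures only the cost of
handing the work off; the real cost shows up at whichever later
call first has to wait for the result.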

After changing your code to force PyCUDA to wait, I got
the following results:

import numpy as np
import pycuda.autoinit
import pycuda.driver
import pycuda.gpuarray
from pycuda.curandom import (PseudoRandomNumberGenerator,
                             QuasiRandomNumberGenerator)
import cProfile
import time as clock


# Stream on which the kernel is launched, so that we can wait on it
cuda_stream = pycuda.driver.Stream()

def curand_prof():

    N = 100000000

    t1 = clock.time()
    # GPU
    rr = PseudoRandomNumberGenerator(
        0, np.random.random(128).astype(np.int32))
    data = pycuda.gpuarray.empty([N], np.float32)
    rr.fill_normal_float(data.gpudata, N, stream=cuda_stream)
    # Wait until the kernel has finished before stopping the clock
    cuda_stream.synchronize()
    t2 = clock.time()
    print("Bench 1: " + str(t2-t1) + " sec")


if __name__ == "__main__":
    t4 = clock.time()
    curand_prof()
    t5 = clock.time()
    print("Bench 2: " + str(t5-t4) + " sec")

Bench 1: 1.15405488014 sec
Bench 2: 1.15947508812 sec

This seems consistent with your results. I ran it on a GTX 460,
which is Fermi. Your GTX 260 is Tesla, so 256 threads are used;
on Fermi 1024 threads are used, so computing the random numbers
takes about a quarter of the time.

Best regards, thanks for noticing this, and thanks for testing
the CURAND wrapper.

-- 
Tomasz Rybak <[email protected]> GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A  488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak
