Dnia 2011-05-06, pią o godzinie 08:30 -0400, Andreas Kloeckner pisze: > On Fri, 06 May 2011 11:00:08 +0200, Tomasz Rybak <bogom...@post.pl> wrote: > > Dnia 2011-04-24, nie o godzinie 17:50 +0200, Tomasz Rybak pisze: > > > Dnia 2011-04-24, nie o godzinie 01:22 -0400, Andreas Kloeckner pisze: > > > > On Sun, 17 Apr 2011 17:10:30 +0200, "=?UTF-8?B?VG9tYXN6IFJ5YmFr?=" > > > > <bogom...@post.pl> wrote: > > > > > Dnia 2011-04-16 00:48 Andreas Kloeckner napisał(a): > > > > > > > > > > >Hi Tomasz, > > > > > > > > > > > >any progress on the things below? Has maybe another email gone > > > > > >missing? > > > > > >I'd like to release PyCUDA 2011.1 soon. > > > > > > > > > > > > > > > I have send the patch to the mailing list on 2011-03-18 and > > > > > 2011-03-24. > > > > > Please check the archives. > > > > > > > > Weird. I'm really not sure what happened there--I could've sworn I never > > > > saw these emails. > > > > > > > > In any case, I've merged your patch into the > > > > curand-wrapper-v2-from-tomasz branch, which I've also brought up to > > > > current master. > > > > > > Thanks! > > > > > > > > > > > Looks like we're almost done here--only initialization is missing. > > > > > > I hope to have it done by the end of the next week. > > > > I have troubles with transferring data between C and Python. > > I want to: > > 1. call curandGetDirectionVectors32 which returns pointer > > to the 32 int32 > > 2. transfer this data to the device memory > > 3. create GPUArray from it > > 4. call prepare() kernel which will pass appropriate > > direction vectors to the curand_init() kernel > > > > I have tried two approaches (curand-hostptr.diff > > and curand-memcpy.diff). The former, when compiled, > > causes _curand module to misbehave - e.g. it get_curand_version() > > returns None instead of (3, 2, 0) or (0, 0, 0) > > The latter compiles but cuMemcpy call fails because of > > parameter type mismatch. > > > > Can someone give me some direction how to transfer raw > > pointer from C to Python? > > What's wrong with creating a numpy array for the data?
Thanks for the tip. Here is the patch. It works, and generates quasi-random numbers. The only problem is when with calling curandGetDirectionVectors - it generates only 20000 vectors, so we will have repeated generators when we have GPUs with more than 20000 cores ;-) BTW - there is difference in wrap_cudadrv.cpp between master and curand branch. I had to fix it but did not include this change in the patch. Please apply this patch - and I believe that we could think about merging curand branch into master. Best regards. -- Tomasz Rybak <bogom...@post.pl> GPG/PGP key ID: 2AD5 9860 Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860 http://member.acm.org/~tomaszrybak
diff --git a/doc/source/array.rst b/doc/source/array.rst index 4dabe59..431aa37 100644 --- a/doc/source/array.rst +++ b/doc/source/array.rst @@ -399,6 +399,21 @@ algorithm designed to fill an n-dimensional space evenly. Quasirandom numbers are more expensive to generate. +.. function:: get_curand_version() + + Obtain the version of CURAND against which PyCUDA was compiled. Returns a + 3-tuple of integers as *(major, minor, revision)*. + +.. function:: seed_getter_uniform(N) + + Return an :class:`GPUArray` filled with one random `int32` repeated `N` + times which can be used as a seed for XORWOW generator. + +.. function:: seed_getter_unique(N) + + Return an :class:`GPUArray` filled with `N` random `int32` which can + be used as a seed for XORWOW generator. + .. class:: XORWOWRandomNumberGenerator(seed_getter=None, offset=0) :arg seed_getter: a function that, given an integer count, will yield an @@ -432,7 +447,18 @@ Quasirandom numbers are more expensive to generate. Accepts array i of integer values, telling each generator how many values to skip. -.. class:: Sobol32RandomNumberGenerator(vector, offset) +.. function:: generate_direction_vectors(count, direction=direction_vector_set.VECTOR_32) + + Return an :class:`GPUArray` `count` filled with direction vectors + used to initialize Sobol32 generators. + +.. class:: Sobol32RandomNumberGenerator(dir_vector=None, offset=0) + + :arg dir_vector: a :class:`GPUArray` of 32-element `int32` vectors which + are used to initialize quasirandom generator; it must contain one vector + for each initialized generator + :arg offset: Starting index into the Sobol32 sequence, given direction + vector. Provides quasirandom numbers. Generates sequences with period of :math:`2^32`. diff --git a/pycuda/curandom.py b/pycuda/curandom.py index 73ca18d..5982b33 100644 --- a/pycuda/curandom.py +++ b/pycuda/curandom.py @@ -255,7 +255,7 @@ else: if get_curand_version() >= (3, 2, 0): direction_vector_set = _curand.direction_vector_set - get_direction_vectors32 = _curand.get_direction_vectors32 + _get_direction_vectors = _curand._get_direction_vectors # {{{ Base class @@ -526,9 +526,14 @@ if get_curand_version() >= (3, 2, 0): # {{{ Sobol32 RNG +def generate_direction_vectors(count, direction=direction_vector_set.VECTOR_32): + result = np.empty((count, 32), dtype=np.int32) + _get_direction_vectors(direction, result, count) + return pycuda.gpuarray.to_gpu(result) + sobol32_random_source = """ extern "C" { -__global__ void prepare(curandStateSobol32 *s, const int n, unsigned int **v, +__global__ void prepare(curandStateSobol32 *s, const int n, curandDirectionVectors32_t *v, const unsigned int o) { const int id = blockIdx.x*blockDim.x+threadIdx.x; @@ -578,10 +583,10 @@ if get_curand_version() >= (3, 2, 0): dev = drv.Context.get_device() if dev.compute_capability() >= (2, 0): p.prepared_call((self.block_count, 1), self.state, - self.block_count * self.generators_per_block, vector, offset) + self.block_count * self.generators_per_block, dir_vector.gpudata, offset) else: p.prepared_call((2 * self.block_count, 1), self.state, - self.block_count * self.generators_per_block // 2, vector, offset) + self.block_count * self.generators_per_block // 2, dir_vector.gpudata, offset) except drv.LaunchError: raise ValueError("Initialisation failed. Decrease number of threads.") diff --git a/src/cpp/curand.hpp b/src/cpp/curand.hpp index e025b0c..9783239 100644 --- a/src/cpp/curand.hpp +++ b/src/cpp/curand.hpp @@ -3,7 +3,28 @@ #if CUDAPP_CUDA_VERSION >= 3020 -#include <curand.h> + #include <curand.h> + + #ifdef CUDAPP_TRACE_CUDA + #define CURAND_PRINT_ERROR_TRACE(NAME, CODE) \ + if (CODE != CURAND_STATUS_SUCCESS) \ + std::cerr << NAME << " failed with code " << CODE << std::endl; + #else + #define CURAND_PRINT_ERROR_TRACE(NAME, CODE) /*nothing*/ + #endif + + #define CURAND_CALL_GUARDED(NAME, ARGLIST) \ + { \ + CUDAPP_PRINT_CALL_TRACE(#NAME); \ + curandStatus_t cu_status_code; \ + cu_status_code = NAME ARGLIST; \ + CURAND_PRINT_ERROR_TRACE(#NAME, cu_status_code); \ + if (cu_status_code != CURAND_STATUS_SUCCESS) \ + throw pycuda::error(#NAME, CUDA_SUCCESS);\ + } +#else + #define CURAND_PRINT_ERROR_TRACE(NAME, CODE) /*nothing*/ + #define CURAND_CALL_GUARDED(NAME, ARGLIST) /*nothing*/ #endif @@ -22,10 +43,26 @@ namespace pycuda { namespace curandom { } #if CUDAPP_CUDA_VERSION >= 3020 - void py_curand_get_direction_vectors32(curandDirectionVectors32_t *vectors[], - curandDirectionVectorSet_t set) -// TODO: checking; cannot use CUDAPP_CALL_GUARDED because function returns CURAND enum - { curandGetDirectionVectors32(vectors, set); } + void py_curand_get_direction_vectors( + curandDirectionVectorSet_t set, py::object dst, int count) + { + void *buf; + PYCUDA_BUFFER_SIZE_T len; + int n = 0; + + if (PyObject_AsWriteBuffer(dst.ptr(), &buf, &len)) + throw py::error_already_set(); + if (CURAND_DIRECTION_VECTORS_32_JOEKUO6 == set) { + curandDirectionVectors32_t *vectors; + CURAND_CALL_GUARDED(curandGetDirectionVectors32, (&vectors, set)); + while (count > 0) { + int size = ((count > 20000) ? 20000 : count)*sizeof(curandDirectionVectors32_t); + memcpy((int *)buf+n*20000*sizeof(curandDirectionVectors32_t)/sizeof(unsigned int), vectors, size); + count -= size/sizeof(curandDirectionVectors32_t); + n++; + } + } + } #endif } } diff --git a/src/wrapper/wrap_curand.cpp b/src/wrapper/wrap_curand.cpp index 71adf57..2c1f2e3 100644 --- a/src/wrapper/wrap_curand.cpp +++ b/src/wrapper/wrap_curand.cpp @@ -25,7 +25,8 @@ void pycuda_expose_curand() py::def("get_curand_version", py_curand_version); #if CUDAPP_CUDA_VERSION >= 3020 - py::def("get_direction_vectors32", py_curand_get_direction_vectors32, (arg("vectors"), arg("set"))); + py::def("_get_direction_vectors", py_curand_get_direction_vectors, + (arg("set"), arg("dst"), arg("count"))); #endif }
signature.asc
Description: This is a digitally signed message part
_______________________________________________ PyCUDA mailing list PyCUDA@tiker.net http://lists.tiker.net/listinfo/pycuda