@ptrendx The device-side API I mentioned is the `RandGenerator` class (the one used in `ndarray.random()`). It generates random numbers with `curand_uniform()`: https://github.com/apache/incubator-mxnet/blob/master/include/mxnet/random_generator.h#L111
The host API (the one I used) can be seen here: https://github.com/apache/incubator-mxnet/blob/master/3rdparty/mshadow/mshadow/random.h#L370 It generates random numbers with `curandGenerateUniform()`. In terms of raw random number generation, `RandGenerator` (which, IMO, is basically a wrapper over the CUDA device API) may be comparable to mshadow/random. However, is it possible that the overhead of _managing random states_ in `RandGenerator` affects its performance?
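
To make the comparison concrete, here is a minimal standalone sketch of the two cuRAND code paths in question. It is not MXNet or mshadow code: the kernel names, the one-state-per-thread layout, the default `curandState` type, and the seed are all assumptions chosen for illustration. It only shows where the extra state management (allocate, initialize, load, store) appears on the device-API side.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <curand.h>         // host API: curandGenerateUniform()
#include <curand_kernel.h>  // device API: curand_init(), curand_uniform()

// Device API, step 1: each thread initializes its own generator state.
// (Hypothetical kernel; one state per thread is an assumption.)
__global__ void init_states(curandState *states, unsigned long long seed, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) curand_init(seed, i, 0, &states[i]);
}

// Device API, step 2: load the state, draw one sample, store the state back.
__global__ void fill_uniform(curandState *states, float *out, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    curandState local = states[i];    // state load from global memory
    out[i] = curand_uniform(&local);  // sample in (0, 1]
    states[i] = local;                // state store for the next call
  }
}

int main() {
  const int n = 1 << 20;
  const unsigned long long seed = 1234ULL;  // arbitrary seed for the sketch
  float *d_out = nullptr;
  cudaMalloc(&d_out, n * sizeof(float));

  // Host API: the library owns the generator state and launches its own kernels.
  curandGenerator_t gen;
  curandCreateGenerator(&gen, CURAND_RNG_PSEUDO_DEFAULT);
  curandSetPseudoRandomGeneratorSeed(gen, seed);
  curandGenerateUniform(gen, d_out, n);
  curandDestroyGenerator(gen);

  // Device API: the caller allocates, initializes, and persists the states.
  curandState *d_states = nullptr;
  cudaMalloc(&d_states, n * sizeof(curandState));
  const int threads = 256;
  const int blocks = (n + threads - 1) / threads;
  init_states<<<blocks, threads>>>(d_states, seed, n);
  fill_uniform<<<blocks, threads>>>(d_states, d_out, n);
  cudaDeviceSynchronize();

  float first = 0.0f;
  cudaMemcpy(&first, d_out, sizeof(float), cudaMemcpyDeviceToHost);
  printf("first device-API sample: %f\n", first);

  cudaFree(d_states);
  cudaFree(d_out);
  return 0;
}
```

This should build with something like `nvcc sketch.cu -lcurand` (the file name is arbitrary). Timing `curandGenerateUniform()` against the `init_states` and `fill_uniform` kernels separately would help show whether the cost sits in state initialization, in the per-call state load/store, or in the sampling itself.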