Please see responses inline.

From: NumPy-Discussion [mailto:numpy-discussion-boun...@scipy.org] On Behalf Of 
Todd
Sent: Wednesday, October 26, 2016 4:04 PM
To: Discussion of Numerical Python <numpy-discussion@scipy.org>
Subject: Re: [Numpy-discussion] Intel random number package

On Wed, Oct 26, 2016 at 4:30 PM, Pavlyk, Oleksandr 
<oleksandr.pav...@intel.com> wrote:

The module under review, similarly to the randomstate package, provides 
alternative basic pseudo-random number generators (BRNGs), like MT2203, MCG31, 
MRG32K3A, and Wichmann-Hill. The scope of support differs, with randomstate 
implementing some generators absent in MKL and vice versa.

Is there a reason that randomstate shouldn't implement those generators?


No, randomstate certainly can implement all the BRNGs implemented in MKL. It is 
at the developer's discretion.



Thinking about the possibility of providing the functionality of this module 
within the framework of randomstate, I find that randomstate implements 
samplers from statistical distributions as functions that take the state of the 
underlying BRNG, and produce a single variate, e.g.:

https://github.com/bashtage/ng-numpy-randomstate/blob/master/randomstate/distributions.c#L23-L26

This design stands in the way of efficient use of MKL, which generates a whole 
vector of variates at a time. Using vectorized instructions, this is faster 
than sampling one variate at a time. So I wrote mkl_distributions.cpp to 
provide functions that return a vector of a given size, filled with variates 
sampled from each supported distribution.
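
As a rough illustration of the two designs (this is not the actual randomstate 
or MKL code), the sketch below contrasts a sampler that returns one variate per 
call with one that produces a whole vector in a single call. The names 
exponential_one and exponential_fill are hypothetical, and Python's standard 
random.Random stands in for the BRNG.

```python
import math
import random


def exponential_one(brng, scale):
    """Per-variate design: take the BRNG state, return a single sample."""
    # 1 - U lies in (0, 1], so the logarithm is always defined.
    return -scale * math.log(1.0 - brng.random())


def exponential_fill(brng, scale, n):
    """Vector design: one call produces all n samples.

    A vendor library could implement this loop with vectorized
    instructions; here it is an ordinary Python loop for illustration.
    """
    out = [0.0] * n
    for i in range(n):
        out[i] = -scale * math.log(1.0 - brng.random())
    return out


# Both designs consume the uniform stream in the same order,
# so they agree when started from the same seed.
a = random.Random(12345)
b = random.Random(12345)
one_at_a_time = [exponential_one(a, 2.0) for _ in range(8)]
vector = exponential_fill(b, 2.0, 8)
```

The results are identical here; the difference is only in where the loop 
lives, which is what lets a vector interface hand the whole job to an 
optimized routine.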

I don't know a huge amount about pseudo-random number generators, but this 
seems superficially to be something that would benefit random number generation 
as a whole independently of whether MKL is used.  Might it be possible to 
modify the numpy implementation to support this sort of vectorized approach?

I also think that adopting a vectorized mindset would benefit np.random. For 
example, Gaussians are currently generated using the Box-Muller algorithm, 
which produces two variates at a time, so one of them currently needs to be 
saved in the random state struct itself, along with an indicator that it should 
be used on the next iteration.  With a vectorized approach one could populate 
the vector two elements at a time with better memory locality, resulting in 
better performance.
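
A minimal sketch of that contrast, under stated assumptions: it uses the basic 
trigonometric form of the Box-Muller transform (which may differ from the 
variant numpy actually uses), Python's random.Random as the uniform source, and 
hypothetical names CachedGauss, box_muller_pair and gauss_fill.

```python
import math
import random


def box_muller_pair(brng):
    """Basic Box-Muller transform: two uniforms in, two normal variates out."""
    u1 = 1.0 - brng.random()  # in (0, 1], keeps log() finite
    u2 = brng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)


class CachedGauss:
    """Scalar design: every other call returns a variate saved in the state."""

    def __init__(self, seed):
        self.brng = random.Random(seed)
        self.has_spare = False  # indicator kept in the state
        self.spare = 0.0        # the saved second variate

    def next(self):
        if self.has_spare:
            self.has_spare = False
            return self.spare
        z0, z1 = box_muller_pair(self.brng)
        self.spare, self.has_spare = z1, True
        return z0


def gauss_fill(brng, n):
    """Vector design: write variates two at a time, no spare in the state."""
    out = [0.0] * n
    for i in range(0, n - 1, 2):
        out[i], out[i + 1] = box_muller_pair(brng)
    if n % 2:  # odd length: generate one more pair and drop its second half
        out[n - 1] = box_muller_pair(brng)[0]
    return out


scalar = CachedGauss(7)
scalar_samples = [scalar.next() for _ in range(6)]
vector_samples = gauss_fill(random.Random(7), 6)
```

Both paths yield the same sequence from the same seed; the vector version 
simply avoids the branch on the cached value for every call.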

A vectorized approach has merits with or without the use of MKL.

Another point already raised by Nathaniel is that numpy's randomness ideally 
should provide a way to override the default algorithm for sampling from a 
particular distribution.  For example, a RandomState object that implements PCG 
may rely on the default acceptance-rejection algorithm for sampling from Gamma, 
while a RandomState object that provides an interface to MKL might want to call 
into MKL directly.
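
One way such an override could look, sketched with hypothetical class and 
function names (BaseRandomState, VendorRandomState, vendor_gamma): the base 
class supplies a generic Gamma sampler, and a subclass replaces only that 
method. Nothing here is numpy's or MKL's actual API; the standard library's 
gammavariate stands in for the default algorithm, and vendor_gamma is a 
placeholder for a call into a vendor library.

```python
import random


class BaseRandomState:
    """Hypothetical base class: generic samplers built on a uniform source."""

    def __init__(self, seed):
        self._brng = random.Random(seed)

    def gamma(self, shape, n):
        # Default path: the stdlib sampler stands in for a generic
        # acceptance-rejection algorithm driven by the BRNG.
        return [self._brng.gammavariate(shape, 1.0) for _ in range(n)]


def vendor_gamma(brng, shape, n):
    """Placeholder for a vendor routine (not a real MKL binding).

    For integer shape k, a Gamma(k, 1) variate is a sum of k Exp(1) variates.
    """
    if shape != int(shape) or shape < 1:
        raise ValueError("placeholder only handles positive integer shapes")
    k = int(shape)
    return [sum(brng.expovariate(1.0) for _ in range(k)) for _ in range(n)]


class VendorRandomState(BaseRandomState):
    """Hypothetical subclass that routes gamma() to the vendor routine."""

    def gamma(self, shape, n):
        return vendor_gamma(self._brng, shape, n)


default_draws = BaseRandomState(1).gamma(3, 5)
vendor_draws = VendorRandomState(1).gamma(3, 5)
```

Callers see the same gamma(shape, n) signature in both cases; only the 
algorithm behind it changes.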

The approach that pyfftw uses at least for scipy, which may also work here, is 
that you can monkey-patch the scipy.fftpack module at runtime, replacing it 
with pyfftw's drop-in replacement.  scipy then proceeds to use pyfftw instead 
of its built-in fftpack implementation.  Might such an approach work here?  
Users can either use this alternative randomstate replacement directly, or they 
can replace numpy's with it at runtime and numpy will then proceed to use the 
alternative.

I think the monkey-patching approach will work.
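
A minimal sketch of the mechanism, using a stand-in module rather than numpy 
itself. The names host_random, fast_standard_normal and patch are hypothetical, 
and the replacement body is a placeholder chosen only to make the swap easy to 
observe.

```python
import random
import types

# Stand-in for the host package's random module (hypothetical, not numpy).
host_random = types.ModuleType("host_random")
host_random.standard_normal = lambda n: [random.gauss(0.0, 1.0) for _ in range(n)]


def library_routine(n):
    """Code elsewhere that looks the sampler up on the module at call time."""
    return host_random.standard_normal(n)


def fast_standard_normal(n):
    """Hypothetical drop-in replacement with the same signature."""
    return [0.0] * n  # placeholder body so the swap is easy to observe


def patch(module, name, replacement):
    """Replace module.name and return the original so it can be restored."""
    original = getattr(module, name)
    setattr(module, name, replacement)
    return original


original = patch(host_random, "standard_normal", fast_standard_normal)
patched_result = library_routine(3)              # served by the replacement
patch(host_random, "standard_normal", original)  # undo the patch
restored_result = library_routine(3)             # original sampler again
```

One caveat of this approach: code that bound the function earlier, for example 
with "from host_random import standard_normal", keeps its reference to the old 
function and is not affected by the patch.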

RandomState was written with a view to replacing numpy.random at some point in 
the future. It is standalone at the moment, from what I understand, only 
because it is still being worked on and extended.

One particularly important development is the ability to sample continuous 
distributions in floats, or to populate a given preallocated buffer with random 
samples. These features are missing from numpy.random_intel, and we have 
thought about providing them.

As I said earlier, another missing feature is a C-API for randomness in 
numpy.


Oleksandr
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
