Hello all,
A request for feedback on the design of this code before I put it through some
serious testing and submit a formal pull request to Phobos' std.random.
The goal here is to provide an interface that is close to the Boost/C++11 design
while allowing for some extra interesting possibilities. In particular the goal
is to allow the possibility of different underlying generation algorithms or
"engines" that can serve different requirements regarding speed, precision,
memory consumption etc. See:
http://www.cse.cuhk.edu.hk/%7Ephwl/mt/public/archives/papers/grng_acmcs07.pdf
... for further details.
The present code contains both a function and struct interface for normal random
number generation; the latter stores the parameters of the distribution (mean
and standard deviation) and an instance of the underlying engine, but not the
random number generator.
(An alternative design choice for the struct would have been to design it as a
range with front/popFront, but for this to work I'd have had to also have it
store the RNG, and this would run into the same problems as RandomSample has due
to RNGs not being reference types. Boost/C++11 also do not store the RNG
internally but implement a variate_generator class which allows the coupling of
a random-number distribution and an underlying generator; this is the approach
I'd take for defining say a RandomVariate range.)
I've so far implemented one underlying engine, the Box-Muller method currently
used in Boost. This is not the best engine possible -- an implementation of the
Ziggurat algorithm is desirable -- but it represents what is typically used in
other libraries. The essence of Box-Muller is that it generates
(uniformly-distributed) numbers a pair at a time and then uses those to generate
a corresponding pair of normally-distributed numbers. The method is described here:
https://en.wikipedia.org/wiki/Box-Muller_transform
I've stuck very closely to the implementation in Boost, to the point of
reproducing a "trick" Boost uses to get round the fact that its uniform random
number generator only supports the interval [a, b) whereas the canonical
description of Box-Muller requires the random numbers to be distributed in the
interval (0, 1]. Obviously in D this is avoidable, but I felt there was some
advantage in reproducing identical results to Boost, at least within the limits
of floating-point numerical rounding (and maybe also because D's PI variable
goes to a greater number of decimal places).
From a licensing point of view this close reproduction isn't an issue, as Boost
and Phobos both use the same licence, but there'll be a credit to the original
Boost authors in any pull request. However, if there's a desire to
differentiate the code more strongly, that can be arranged :-) I've attached a
small C++ code example to demonstrate the common output for the D version and
that of Boost.
Anyway, what I'd really appreciate feedback on is the interface design for these
functions and structs. A few concrete questions:
(1) Does the order of template arguments seem optimal? My inclination is
perhaps to tweak it so that the Uniform RNG type goes first, the underlying
engine second, and the input/output type T goes last, as these are most likely
the order in which people will want to change things.
(2) Is there a desire to have a separate input and output type?
Alternatively, should I perhaps treat all input and internal processing as using
the real type and just use T for the output?
(3) Given the default template arguments provided, is there any way to
ensure that an instance of the Normal struct can be implemented like this:
auto normal01 = Normal(0, 1);
rather than having to do this:
auto normal01 = Normal!()(0, 1);
Yes, I could create a function that returns an instance, but I'm not sure what
to call it given that I want the normal() function to serve a similar purpose as
uniform() ... :-)
Thanks in advance for any and all feedback.
Best wishes,
-- Joe
import std.conv, std.exception, std.math, std.random, std.stdio, std.traits;
/**
Generates a random floating-point number drawn from a
normal (Gaussian) distribution with specified mean and
standard deviation (sigma).
*/
auto normal(T = real, alias NormalRandomNumberEngine = NormalBoxMullerEngine, UniformRandomNumberGenerator = Random)
(T mean, T sigma, ref UniformRandomNumberGenerator urng = rndGen)
if (isFloatingPoint!T && isUniformRNG!UniformRandomNumberGenerator)
{
static NormalRandomNumberEngine!(T, UniformRandomNumberGenerator) engine;
return normal(mean, sigma, engine, urng);
}
/**
ditto
*/
auto normal(T = real, alias NormalRandomNumberEngine = NormalBoxMullerEngine, UniformRandomNumberGenerator = Random)
(T mean, T sigma, ref NormalRandomNumberEngine!(T, UniformRandomNumberGenerator) normalEngine, ref UniformRandomNumberGenerator urng = rndGen)
if (isFloatingPoint!T && isUniformRNG!UniformRandomNumberGenerator)
{
enforce(0 <= sigma, text("std.random.normal(): standard deviation ", sigma, " is less than zero"));
return sigma * normalEngine(urng) + mean;
}
struct Normal(T = real, alias NormalRandomNumberEngine = NormalBoxMullerEngine, UniformRandomNumberGenerator = Random)
if (isFloatingPoint!T && isUniformRNG!UniformRandomNumberGenerator)
{
private T _mean, _sigma;
private NormalRandomNumberEngine!(T, UniformRandomNumberGenerator) _engine;
this(T m, T s)
{
_mean = m;
_sigma = s;
}
@property T mean()
{
return _mean;
}
@property T stddev()
{
return _sigma;
}
T opCall(ref UniformRandomNumberGenerator urng)
{
return normal!(T, NormalRandomNumberEngine, UniformRandomNumberGenerator)(_mean, _sigma, _engine, urng);
}
}
/**
Generates a random floating-point number drawn from a
normal (Gaussian) distribution with mean 0 and standard
deviation (sigma) 1, using the Box-Muller Transform method.
*/
struct NormalBoxMullerEngine(T, UniformRandomNumberGenerator)
if(isFloatingPoint!T)
{
private bool _valid = false;
private T _rho, _r1, _r2;
T opCall(ref UniformRandomNumberGenerator urng)
{
if(_valid)
_valid = false;
else
{
_r1 = uniform!("[)", T, T)(0, 1, urng);
_r2 = uniform!("[)", T, T)(0, 1, urng);
_rho = sqrt(-2 * log((cast(T) 1) - _r2));
_valid = true;
}
return _rho * (_valid ? cos((cast(T) 2) * PI * _r1)
: sin((cast(T) 2) * PI * _r1));
}
}
void main()
{
Mt19937 rng;
// This works
auto normal3_10 = Normal!()(3, 10);
rng.seed(54321);
foreach(i; 0..10)
writefln("%g", normal(cast(real) 3, cast(real) 10, rng));
writeln();
rng.seed(54321);
// This is to test that the internal engine really is statically stored ...
foreach(i; 0..3)
writefln("%g", normal(cast(real) 3, cast(real) 10, rng));
foreach(i; 3..10)
writefln("%g", normal(cast(real) 3, cast(real) 10, rng));
writeln();
rng.seed(54321);
// And now let's confirm that the struct implementation produces same results
foreach(i; 0..10)
writefln("%g", normal3_10(rng));
writeln();
writeln("Mean = ", normal3_10.mean, ", standard deviation = ", normal3_10.stddev);
}
#include <iostream>
#include <boost/random.hpp>
using namespace std;
using namespace boost::random;
int main(void)
{
mt19937 rng;
rng.seed(54321);
normal_distribution<> normal3_10(3, 10);
for(size_t i=0; i < 10; ++i)
cout << normal3_10(rng) << endl;
return 0;
}