Hello all,

A request for feedback on the design of this code before I put it through some serious testing and submit a formal pull request to Phobos' std.random.

The goal here is to provide an interface that is close to the Boost/C++11 design while allowing for some extra interesting possibilities. In particular the goal is to allow the possibility of different underlying generation algorithms or "engines" that can serve different requirements regarding speed, precision, memory consumption etc. See:
http://www.cse.cuhk.edu.hk/%7Ephwl/mt/public/archives/papers/grng_acmcs07.pdf

... for further details.

The present code contains both a function and struct interface for normal random number generation; the latter stores the parameters of the distribution (mean and standard deviation) and an instance of the underlying engine, but not the random number generator.

(An alternative design choice for the struct would have been to design it as a range with front/popFront, but for this to work I'd have had to also have it store the RNG, and this would run into the same problems as RandomSample has due to RNGs not being reference types. Boost/C++11 also do not store the RNG internally but implement a variate_generator class which allows the coupling of a random-number distribution and an underlying generator; this is the approach I'd take for defining say a RandomVariate range.)

I've so far implemented one underlying engine, the Box-Muller method currently used in Boost. This is not the best engine possible -- an implementation of the Ziggurat algorithm is desirable -- but it represents what is typically used in other libraries. The essence of Box-Muller is that it generates (uniformly-distributed) numbers a pair at a time and then uses those to generate a corresponding pair of normally-distributed numbers. The method is described here:
https://en.wikipedia.org/wiki/Box-Muller_transform

I've stuck very closely to the implementation in Boost, to the point of reproducing a "trick" Boost uses to get round the fact that its uniform random number generator only supports the interval [a, b) whereas the canonical description of Box-Muller requires the random numbers to be distributed in the interval (0, 1]. Obviously in D this is avoidable, but I felt there was some advantage in reproducing identical results to Boost, at least within the limits of floating-point numerical rounding (and maybe also because D's PI variable goes to a greater number of decimal places).

From a licensing point of view this close reproduction isn't an issue, as Boost and Phobos both use the same licence, but there'll be a credit to the original Boost authors in any pull request. However, if there's a desire to differentiate the code more strongly, that can be arranged :-) I've attached a small C++ code example to demonstrate the common output for the D version and that of Boost.

Anyway, what I'd really appreciate feedback on is the interface design for these functions and structs. A few concrete questions:

(1) Does the order of template arguments seem optimal? My inclination is perhaps to tweak it so that the Uniform RNG type goes first, the underlying engine second, and the input/output type T goes last, as these are most likely the order in which people will want to change things.

(2) Is there a desire to have a separate input and output type? Alternatively, should I perhaps treat all input and internal processing as using the real type and just use T for the output?

(3) Given the default template arguments provided, is there any way to ensure that an instance of the Normal struct can be implemented like this:

     auto normal01 = Normal(0, 1);

rather than having to do this:

     auto normal01 = Normal!()(0, 1);

Yes, I could create a function that returns an instance, but I'm not sure what to call it given that I want the normal() function to serve a similar purpose as uniform() ... :-)

Thanks in advance for any and all feedback.

Best wishes,

     -- Joe
import std.conv, std.exception, std.math, std.random, std.stdio, std.traits;

/**
Generates a random floating-point number drawn from a
normal (Gaussian) distribution with specified mean and
standard deviation (sigma).
*/
auto normal(T = real, alias NormalRandomNumberEngine = NormalBoxMullerEngine, UniformRandomNumberGenerator = Random)
(T mean, T sigma, ref UniformRandomNumberGenerator urng = rndGen)
if (isFloatingPoint!T && isUniformRNG!UniformRandomNumberGenerator)
{
    static NormalRandomNumberEngine!(T, UniformRandomNumberGenerator) engine;
    return normal(mean, sigma, engine, urng);
}

/**
ditto
*/
auto normal(T = real, alias NormalRandomNumberEngine = NormalBoxMullerEngine, UniformRandomNumberGenerator = Random)
(T mean, T sigma, ref NormalRandomNumberEngine!(T, UniformRandomNumberGenerator) normalEngine, ref UniformRandomNumberGenerator urng = rndGen)
if (isFloatingPoint!T && isUniformRNG!UniformRandomNumberGenerator)
{
    enforce(0 <= sigma, text("std.random.normal(): standard deviation ", sigma, " is less than zero"));
    return sigma * normalEngine(urng) + mean;
}

struct Normal(T = real, alias NormalRandomNumberEngine = NormalBoxMullerEngine, UniformRandomNumberGenerator = Random)
if (isFloatingPoint!T && isUniformRNG!UniformRandomNumberGenerator)
{
    private T _mean, _sigma;
    private NormalRandomNumberEngine!(T, UniformRandomNumberGenerator) _engine;

    this(T m, T s)
    {
        _mean = m;
        _sigma = s;
    }

    @property T mean()
    {
        return _mean;
    }

    @property T stddev()
    {
        return _sigma;
    }

    T opCall(ref UniformRandomNumberGenerator urng)
    {
        return normal!(T, NormalRandomNumberEngine, UniformRandomNumberGenerator)(_mean, _sigma, _engine, urng);
    }
}

/**
Generates a random floating-point number drawn from a
normal (Gaussian) distribution with mean 0 and standard
deviation (sigma) 1, using the Box-Muller Transform method.
*/
struct NormalBoxMullerEngine(T, UniformRandomNumberGenerator)
if(isFloatingPoint!T)
{
    private bool _valid = false;
    private T _rho, _r1, _r2;

    T opCall(ref UniformRandomNumberGenerator urng)
    {
        if(_valid)
            _valid = false;
        else
        {
            _r1 = uniform!("[)", T, T)(0, 1, urng);
            _r2 = uniform!("[)", T, T)(0, 1, urng);
            _rho = sqrt(-2 * log((cast(T) 1) - _r2));
            _valid = true;
        }

        return _rho * (_valid ? cos((cast(T) 2) * PI * _r1)
                              : sin((cast(T) 2) * PI * _r1));
    }
}



void main()
{
    Mt19937 rng;
    // This works
    auto normal3_10 = Normal!()(3, 10);

    rng.seed(54321);

    foreach(i; 0..10)
        writefln("%g", normal(cast(real) 3, cast(real) 10, rng));
    writeln();

    rng.seed(54321);

    // This is to test that the internal engine really is statically stored ...
    foreach(i; 0..3)
        writefln("%g", normal(cast(real) 3, cast(real) 10, rng));
    foreach(i; 3..10)
        writefln("%g", normal(cast(real) 3, cast(real) 10, rng));
    writeln();

    rng.seed(54321);

    // And now let's confirm that the struct implementation produces same results
    foreach(i; 0..10)
        writefln("%g", normal3_10(rng));
    writeln();

    writeln("Mean = ", normal3_10.mean, ", standard deviation = ", normal3_10.stddev);
}
#include <iostream>
#include <boost/random.hpp>

using namespace std;
using namespace boost::random;

int main(void)
{
	mt19937 rng;
	rng.seed(54321);
	normal_distribution<> normal3_10(3, 10);

	for(size_t i=0; i < 10; ++i)
		cout << normal3_10(rng) << endl;

	return 0;
}

Reply via email to