New shape samplers have been added to the library to sample coordinates
from different shapes (see RNG-132 [1]).

I have been working on an idea to combine shape samplers together so that a
more complex shape can be sampled, for example a surface or volume.

This requires that different samplers can be treated as a common sampler
type. To support this I propose adding new interfaces to the library. These
are generically typed versions of the current DiscreteSampler (for int) and
ContinuousSampler (for double), together with their SharedStateSampler
extensions:

public interface ObjectSampler<T> {
    T sample();
}

public interface SharedStateObjectSampler<T> extends
        ObjectSampler<T>,
        SharedStateSampler<SharedStateObjectSampler<T>> {
    // Composite interface
}
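
To illustrate how a sampler would implement the composite interface, here
is a minimal sketch. It assumes local stand-ins for the library types so
that it compiles on its own: Rng is a hypothetical one-method replacement
for UniformRandomProvider, and LineSampler is a made-up sampler, not one
from the library.

```java
import java.util.SplittableRandom;

/** Sketch only: local stand-ins for the library types. */
final class ObjectSamplerSketch {
    /** Hypothetical stand-in for UniformRandomProvider. */
    interface Rng {
        double nextDouble();
    }

    /** Stand-in for the library's SharedStateSampler. */
    interface SharedStateSampler<R> {
        R withUniformRandomProvider(Rng rng);
    }

    interface ObjectSampler<T> {
        T sample();
    }

    interface SharedStateObjectSampler<T>
            extends ObjectSampler<T>, SharedStateSampler<SharedStateObjectSampler<T>> {
        // Composite interface
    }

    /** Hypothetical sampler: a uniform point on the line segment from a to b. */
    static final class LineSampler implements SharedStateObjectSampler<double[]> {
        private final Rng rng;
        private final double[] a;
        private final double[] b;

        private LineSampler(Rng rng, double[] a, double[] b) {
            this.rng = rng;
            this.a = a;
            this.b = b;
        }

        /** Factory method; the end points are copied once here. */
        static LineSampler of(Rng rng, double[] a, double[] b) {
            return new LineSampler(rng, a.clone(), b.clone());
        }

        @Override
        public double[] sample() {
            final double u = rng.nextDouble();
            final double[] x = new double[a.length];
            for (int i = 0; i < x.length; i++) {
                x[i] = a[i] + u * (b[i] - a[i]);
            }
            return x;
        }

        @Override
        public SharedStateObjectSampler<double[]> withUniformRandomProvider(Rng newRng) {
            // The end points are shared with the new instance; only the generator differs.
            return new LineSampler(newRng, a, b);
        }
    }

    /** Convenience generator for the sketch, backed by java.util.SplittableRandom. */
    static Rng rng(long seed) {
        final SplittableRandom r = new SplittableRandom(seed);
        return r::nextDouble;
    }
}
```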

All the samplers in the library that create object samples already use the
method name sample() and implement SharedStateSampler. The exception is the
UnitSphereSampler which has a sampling method nextVector(). So adding these
interfaces is a small change to facilitate a composite sampler.

The composite sampler should combine many samplers, each with its own
weight. The weights can be used to create a discrete probability
distribution. We have 3 samplers that can sample efficiently from this:

GuideTableDiscreteSampler
AliasMethodDiscreteSampler
MarsagliaTsangWangDiscreteSampler.Enumerated
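
As a rough illustration of how weights become a discrete distribution, the
sketch below normalises the weights and selects an index by a linear search
of the cumulative probabilities. This is plain inverse-CDF selection, not
the algorithm used by any of the three samplers listed above; the class and
method names are hypothetical.

```java
/** Sketch only: weights to probabilities, then index selection. */
final class WeightSelectionSketch {
    /** Normalise finite, non-negative weights to probabilities that sum to 1. */
    static double[] normalise(double[] weights) {
        double sum = 0;
        for (final double w : weights) {
            // The negated comparison also rejects NaN.
            if (!(w >= 0) || Double.isInfinite(w)) {
                throw new IllegalArgumentException("Invalid weight: " + w);
            }
            sum += w;
        }
        if (sum == 0) {
            throw new IllegalArgumentException("No positive weight");
        }
        final double[] p = new double[weights.length];
        for (int i = 0; i < p.length; i++) {
            p[i] = weights[i] / sum;
        }
        return p;
    }

    /** Map a uniform deviate u in [0, 1) to an index using the cumulative sum. */
    static int select(double[] probabilities, double u) {
        double cumulative = 0;
        for (int i = 0; i < probabilities.length - 1; i++) {
            cumulative += probabilities[i];
            if (u < cumulative) {
                return i;
            }
        }
        // The last index absorbs any rounding error in the cumulative sum.
        return probabilities.length - 1;
    }
}
```

The listed samplers replace the linear search with a precomputed table so
that selection cost does not grow with the number of samplers in the same
way.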

So a composite sampler must accept a set of weighted samplers (of the same
type) and create a discrete sampler to select which one to sample. This is
facilitated using a builder API, where S is the type of sampler:

public interface Builder<S> {
    int size();
    Builder<S> add(S sampler, double weight);
    Builder<S> setFactory(DiscreteProbabilitySamplerFactory factory);
    // Only works if size > 0
    S build(UniformRandomProvider rng);
}

The factory specifies a mechanism to create the user's choice of discrete
sampler:

public interface DiscreteProbabilitySamplerFactory {
    DiscreteSampler create(UniformRandomProvider rng,
                           double[] probabilities);
}

Setting the factory is optional as a default will exist. The sampler chosen
for the DiscreteProbabilityCollectionSampler was the
GuideTableDiscreteSampler due to its low construction overhead (see
RNG-109 [2]).
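
A minimal sketch of how such a builder could work is shown below. It is
not the working version mentioned later in this message. For brevity it is
typed on the sample type T (covering only ObjectSampler<T>) rather than on
the sampler type S, it uses local stand-ins for the library interfaces, and
its default factory is a simple cumulative-probability search standing in
for the GuideTableDiscreteSampler. Rejecting non-positive weights is a
choice made for the sketch only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SplittableRandom;

/** Sketch only: a reduced builder for composite object samplers. */
final class CompositeBuilderSketch {
    /** Hypothetical stand-in for UniformRandomProvider. */
    interface Rng { double nextDouble(); }
    interface DiscreteSampler { int sample(); }
    interface ObjectSampler<T> { T sample(); }
    interface DiscreteProbabilitySamplerFactory {
        DiscreteSampler create(Rng rng, double[] probabilities);
    }

    /** Assumed default: linear search of the cumulative probabilities. */
    static final DiscreteProbabilitySamplerFactory DEFAULT_FACTORY = (rng, p) -> () -> {
        final double u = rng.nextDouble();
        double cumulative = 0;
        for (int i = 0; i < p.length - 1; i++) {
            cumulative += p[i];
            if (u < cumulative) {
                return i;
            }
        }
        return p.length - 1;
    };

    static final class Builder<T> {
        private final List<ObjectSampler<T>> samplers = new ArrayList<>();
        private final List<Double> weights = new ArrayList<>();
        private DiscreteProbabilitySamplerFactory factory = DEFAULT_FACTORY;

        int size() {
            return samplers.size();
        }

        Builder<T> add(ObjectSampler<T> sampler, double weight) {
            // Sketch-only rule: weights must be finite and strictly positive.
            if (!(weight > 0) || Double.isInfinite(weight)) {
                throw new IllegalArgumentException("Invalid weight: " + weight);
            }
            samplers.add(sampler);
            weights.add(weight);
            return this;
        }

        Builder<T> setFactory(DiscreteProbabilitySamplerFactory newFactory) {
            this.factory = newFactory;
            return this;
        }

        ObjectSampler<T> build(Rng rng) {
            final int n = size();
            if (n == 0) {
                throw new IllegalStateException("No samplers have been added");
            }
            double sum = 0;
            for (final double w : weights) {
                sum += w;
            }
            final double[] p = new double[n];
            for (int i = 0; i < n; i++) {
                p[i] = weights.get(i) / sum;
            }
            // Copy the list so later changes to the builder do not affect the result.
            final List<ObjectSampler<T>> list = new ArrayList<>(samplers);
            final DiscreteSampler discrete = factory.create(rng, p);
            // Select a sampler by index, then delegate the sample to it.
            return () -> list.get(discrete.sample()).sample();
        }
    }

    /** Convenience generator for the sketch, backed by java.util.SplittableRandom. */
    static Rng rng(long seed) {
        final SplittableRandom r = new SplittableRandom(seed);
        return r::nextDouble;
    }
}
```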

A static class provides a mechanism to create composite samplers via
builders typed to the final sample type:

public final class CompositeSamplers {
    public static <T> Builder<ObjectSampler<T>> newObjectSamplerBuilder();
    public static <T> Builder<SharedStateObjectSampler<T>>
        newSharedStateObjectSamplerBuilder();
    public static Builder<DiscreteSampler> newDiscreteSamplerBuilder();
    public static Builder<SharedStateDiscreteSampler>
        newSharedStateDiscreteSamplerBuilder();
    public static Builder<ContinuousSampler> newContinuousSamplerBuilder();
    public static Builder<SharedStateContinuousSampler>
        newSharedStateContinuousSamplerBuilder();
}

An example of usage would be:

UniformRandomProvider rng = ...;
DiscreteSampler dayOfMonth = CompositeSamplers.newDiscreteSamplerBuilder()
    .add(DiscreteUniformSampler.of(rng, 1, 31), 31) // Jan
    .add(DiscreteUniformSampler.of(rng, 1, 28), 28) // Feb
    .add(DiscreteUniformSampler.of(rng, 1, 31), 31) // Mar
    .add(DiscreteUniformSampler.of(rng, 1, 30), 30) // Apr
    .add(DiscreteUniformSampler.of(rng, 1, 31), 31) // May
    .add(DiscreteUniformSampler.of(rng, 1, 30), 30) // Jun
    .add(DiscreteUniformSampler.of(rng, 1, 31), 31) // Jul
    .add(DiscreteUniformSampler.of(rng, 1, 31), 31) // Aug
    .add(DiscreteUniformSampler.of(rng, 1, 30), 30) // Sep
    .add(DiscreteUniformSampler.of(rng, 1, 31), 31) // Oct
    .add(DiscreteUniformSampler.of(rng, 1, 30), 30) // Nov
    .add(DiscreteUniformSampler.of(rng, 1, 31), 31) // Dec
    .build(rng);
int day = dayOfMonth.sample();

// Diamond vertices
double[] a = {0, 0};
double[] b = {1, 1};
double[] c = {2, 0};
double[] d = {1, -1};
// Note: The sample type (double[]) must be specified if the builder is
// not assigned
ObjectSampler<double[]> diamond =
    CompositeSamplers.<double[]>newObjectSamplerBuilder()
    .add(TriangleSampler.of(a, b, c, rng), 1) // Upper
    .add(TriangleSampler.of(a, d, c, rng), 1) // Lower
    .build(rng);
double[] coord = diamond.sample();

// Note: Type is inferred if the builder is assigned and then used:
Builder<ObjectSampler<double[]>> builder =
    CompositeSamplers.newObjectSamplerBuilder();
builder.add(TriangleSampler.of(a, b, c, rng), 1); // Upper
// etc.
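
The equal weights in the diamond example match the equal areas of its two
triangles. For a composite of non-overlapping triangles with different
areas, a uniform sample over the whole shape needs each weight to be
proportional to the triangle's area. A small sketch of computing such
weights in 2D follows; the class and method names are hypothetical.

```java
/** Sketch only: weights for 2D triangles from their areas. */
final class TriangleWeightSketch {
    /** Area of a 2D triangle from the cross product of two edge vectors. */
    static double area(double[] a, double[] b, double[] c) {
        final double cross = (b[0] - a[0]) * (c[1] - a[1])
                           - (c[0] - a[0]) * (b[1] - a[1]);
        return 0.5 * Math.abs(cross);
    }
}
```

With the diamond vertices above, area(a, b, c) and area(a, d, c) are both
1, so the weights 1 and 1 give a uniform sample of the diamond. Moving d to
{1, -3} would make the lower triangle three times larger and its weight
should then be 3.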

I have a working version of the above and can create a WIP pull request for
a detailed inspection.

I suggest starting by adding the two new interfaces (ObjectSampler<T> and
SharedStateObjectSampler<T>) and changing the codebase to implement them.
The composite sampler can then be added in a separate change that will
require further discussion.

Note: I had started with the idea of a static factory method:

public static <T> SharedStateObjectSampler<T> of(
        UniformRandomProvider rng,
        List<? extends SharedStateObjectSampler<T>> samplers,
        double[] weights) {

This is a similar idea to the factory constructor for
the DiscreteProbabilityCollectionSampler. However, adding similar methods
for all six sampler types (as above) requires more code. With the builder
API it can be done with a single generic builder that encapsulates all the
functionality of collecting the samplers and constructing the discrete
probability sampler. The builder also allows optional arguments, such as
the factory that controls the discrete probability sampler. The static
factory method requires a user to create a list to hold each sampler and
then an array of weights before calling the factory method. In my opinion
the builder is easier to use as the samplers can be added as they are
generated.

Alex

[1] https://issues.apache.org/jira/browse/RNG-132
[2] https://issues.apache.org/jira/browse/RNG-109
