Hi.
On Sat, 4 May 2019 at 21:31, Alex Herbert <[email protected]> wrote:
>
>
>
> > On 4 May 2019, at 14:46, Gilles Sadowski <[email protected]> wrote:
> >
> > Hello.
> >
> > On Fri, 3 May 2019 at 16:57, Alex Herbert <[email protected]> wrote:
> >>
> >> Most of the samplers in the library have very small states that are easy
> >> to compute. Some have computations that are more expensive, such as the
> >> LargeMeanPoissonSampler or the DiscreteProbabilityCollectionSampler.
> >>
> >> However once the state is computed the only part of the state that
> >> changes is the RNG. I would like to suggest a way to copy samplers as
> >> something like:
> >>
> >> DiscreteSampler newInstance(UniformRandomProvider)
> >>
> >> The new instance would share all the private state of the first sampler
> >> except the RNG. This can be used for multi-threaded applications which
> >> require a new sampler per thread but sample from the same distribution.
> >>
> >> A particular case in point is the as yet not integrated
> >> MarsagliaTsangWangSmallMeanPoissonSampler (see RNG-91 [1]) which has a
> >> "large" state [2] that takes a "long" time [3] to compute but is
> >> effectively immutable. This could be shared across instances saving
> >> memory for parallel application.
> >>
> >> A copy instance would be almost zero set-up time and provide opportunity
> >> for caching of commonly used samplers.
> >
> > The goal is sharing (immutable) state so it seems that the semantics is
> > not "copy".
> >
> > Isn't it a "factory" that we are after? E.g. something like:
> > public final class CachedSamplingFactory {
> >     private static PoissonSamplerCache poisson = new PoissonSamplerCache();
> >
> >     public PoissonSampler createPoissonSampler(UniformRandomProvider rng,
> >                                                double mean) {
> >         if (!poisson.isCached(mean)) {
> >             // Initialize (requires synchronization) ...
> >             poisson.createCache(mean);
> >         }
> >         // Construct using pre-built state.
> >         return new PoissonSampler(poisson.getCache(mean), rng);
> >     }
> > }
> > [It may be overkill, more work, and less performant…]
>
> But then you need a factory for every class whose state you want to share,
> and the factory actually has to look in a cache. If you operate on an
> instance you get exactly what you want: another working version of the same
> sampler. It would also be thread-safe without synchronisation as long as the
> shared state is immutable. The only mutable state is the passed-in RNG.
Agreed. It was what I meant by the last sentence.
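
A minimal sketch of the instance-based approach, to make the sharing concrete. Everything here is an assumption for illustration: "TableSampler" and "sharesStateWith" are hypothetical names, and "UniformRandomProvider" is a local one-method stand-in so that the example compiles without the library on the classpath.

```java
/** Local stand-in for Commons RNG's UniformRandomProvider (assumption: only nextDouble() is needed). */
interface UniformRandomProvider {
    double nextDouble();
}

/** Hypothetical sampler whose pre-computed table is immutable and can be shared. */
final class TableSampler {
    /** Cumulative probabilities; computed once and never modified afterwards. */
    private final double[] cumulative;
    /** The only state that differs between instances. */
    private final UniformRandomProvider rng;

    /** Normal construction: pays the set-up cost. */
    TableSampler(UniformRandomProvider rng, double[] probabilities) {
        this.rng = rng;
        double[] table = new double[probabilities.length];
        double sum = 0;
        for (int i = 0; i < table.length; i++) {
            sum += probabilities[i];
            table[i] = sum;
        }
        for (int i = 0; i < table.length; i++) {
            table[i] /= sum; // normalise so the last entry is 1
        }
        this.cumulative = table;
    }

    /** Sharing construction: reuses the table of an existing sampler. */
    private TableSampler(UniformRandomProvider rng, TableSampler source) {
        this.rng = rng;
        this.cumulative = source.cumulative; // shared reference, no copy
    }

    /** Returns a sampler for the same distribution driven by another RNG. */
    TableSampler newInstance(UniformRandomProvider newRng) {
        return new TableSampler(newRng, this);
    }

    /** Returns the index of the first cumulative entry above a uniform deviate. */
    int sample() {
        final double u = rng.nextDouble();
        for (int i = 0; i < cumulative.length; i++) {
            if (u < cumulative[i]) {
                return i;
            }
        }
        return cumulative.length - 1; // guard against rounding in the last entry
    }

    /** Only for illustration: true if both samplers use the same table object. */
    boolean sharesStateWith(TableSampler other) {
        return cumulative == other.cumulative;
    }
}
```

No cache lookup and no synchronisation are involved: the table is only ever read after construction, so each thread can call "newInstance" with its own RNG.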
> >
> > IIUC, you suggest adding "newInstance" to the "DiscreteSampler" interface
> > (?).
>
> I did think of extending DiscreteSampler with this functionality, rather than
> adding to the interface itself: it is currently ‘functional’ because it has
> only one method, and I think that should not change. Having thought about it
> a bit more, I like the idea of a new functional interface. Perhaps:
>
> interface DiscreteSamplerProvider {
>     DiscreteSampler create(UniformRandomProvider rng);
> }
>
> ‘Provider’ could be replaced with:
>
> - Generator
> - Supplier (possible clash or alignment with Java 8 depending on the way it
> is done)
> - Factory (though the method is not static so I do not like this)
> - etc
>
> So this then becomes a functional interface that can be used by anything.
> However instances of a sampler would be expected to return a sampler matching
> their own functionality.
>
> Note that some samplers that do not implement an interface could also benefit
> from this, namely CollectionSampler and DiscreteProbabilityCollectionSampler.
> So does this need a generic interface:
>
> interface Sampler<T> {
>     T sample();
> }
>
> To be complemented with:
>
> interface SamplerProvider<T> {
>     Sampler<T> create(UniformRandomProvider rng);
> }
>
> So the library would require:
>
> SamplerProvider<T>
> DiscreteSamplerProvider
> ContinuousSamplerProvider
>
> Any sampler can choose to implement a Provider. There are some cases where it
> is moot. For example, a ZigguratNormalizedGaussianSampler just stores the rng
> in the constructor. However, it could still be a Provider; the method would
> simply call the constructor. This would allow writing a generic
> multi-threaded application that uses e.g. a DiscreteSamplerProvider to create
> a sampler for each thread. You can then drop in the actual implementation you
> require. For example, you could swap the type of PoissonSampler in your
> simulation by swapping the provider for the Poisson distribution.
>
> How does that sound?
Fine to have
DiscreteSamplerProvider
ContinuousSamplerProvider
[Perhaps the "Supplier" suffix would be better to avoid confusion with
"UniformRandomProvider".]
At first sight, I don't think that the generic interface would have
any actual use since, ultimately, the return value of "sample()"
will be either "int" or "double" (no polymorphism).
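
A rough sketch of how the two primitive-typed interfaces could look and be used, under the "Supplier" naming. All types below are local stand-ins defined only so that the example compiles on its own; "PerThreadSampling" and "sumPerThread" are hypothetical names, and java.util.Random merely plays the role of a per-thread RNG.

```java
import java.util.Random;

/** Local stand-in for Commons RNG's UniformRandomProvider (assumption: only nextDouble() is needed). */
interface UniformRandomProvider {
    double nextDouble();
}

/** Stand-in mirroring the single-method discrete sampler interface. */
@FunctionalInterface
interface DiscreteSampler {
    int sample();
}

/** Stand-in mirroring the single-method continuous sampler interface. */
@FunctionalInterface
interface ContinuousSampler {
    double sample();
}

/** Proposed: creates a discrete sampler bound to the given RNG. */
@FunctionalInterface
interface DiscreteSamplerSupplier {
    DiscreteSampler create(UniformRandomProvider rng);
}

/** Proposed: creates a continuous sampler bound to the given RNG. */
@FunctionalInterface
interface ContinuousSamplerSupplier {
    ContinuousSampler create(UniformRandomProvider rng);
}

/** Hypothetical generic driver: one sampler per thread, same distribution. */
final class PerThreadSampling {
    private PerThreadSampling() {}

    /** Sums n samples on each thread; one thread (and one RNG) per seed. */
    static long[] sumPerThread(DiscreteSamplerSupplier supplier, long[] seeds, int n) {
        final long[] sums = new long[seeds.length];
        final Thread[] threads = new Thread[seeds.length];
        for (int t = 0; t < seeds.length; t++) {
            final int id = t;
            final Random random = new Random(seeds[t]); // RNG owned by one thread
            final DiscreteSampler sampler = supplier.create(random::nextDouble);
            threads[t] = new Thread(() -> {
                long sum = 0;
                for (int i = 0; i < n; i++) {
                    sum += sampler.sample();
                }
                sums[id] = sum; // each thread writes only its own slot
            });
            threads[t].start();
        }
        try {
            for (Thread thread : threads) {
                thread.join(); // results are visible after join
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sampling", e);
        }
        return sums;
    }
}
```

The driver only sees a DiscreteSamplerSupplier, so the distribution can be swapped without touching it. For instance, a six-sided die could be passed as the lambda "rng -> () -> 1 + (int) (rng.nextDouble() * 6)". The return types stay "int" and "double" throughout, so no boxing is needed.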
Gilles
>
> Alex
>
>
>
> > I'm a bit wary that this would compound two different functionalities:
> > * data generator (method "sample"),
> > * generator generator (method "newInstance").
> > [But I currently don't have an example where this would be a problem.]
> >
> > Regards,
> > Gilles
> >
> >> Alex
> >>
> >> [1] https://issues.apache.org/jira/browse/RNG-91
> >>
> >> [2] kB, or possibly MB, of tabulated data
> >>
> >> [3] Set-up cost for this Poisson sampler is on the order of 30 to 165 times
> >> that of a SmallMeanPoissonSampler for means of 2 and 32, respectively. Note,
> >> however, that construction still takes only 1.1 and 4.5 microseconds even
> >> for this "long" time.