Hi.

On Sat, 4 May 2019 at 21:31, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>
> > On 4 May 2019, at 14:46, Gilles Sadowski <gillese...@gmail.com> wrote:
> >
> > Hello.
> >
> > On Fri, 3 May 2019 at 16:57, Alex Herbert <alex.d.herb...@gmail.com> wrote:
> >>
> >> Most of the samplers in the library have very small states that are easy
> >> to compute. Some have computations that are more expensive, such as the
> >> LargeMeanPoissonSampler or the DiscreteProbabilityCollectionSampler.
> >>
> >> However once the state is computed the only part of the state that
> >> changes is the RNG. I would like to suggest a way to copy samplers as
> >> something like:
> >>
> >> DiscreteSampler newInstance(UniformRandomProvider)
> >>
> >> The new instance would share all the private state of the first sampler
> >> except the RNG. This can be used for multi-threaded applications which
> >> require a new sampler per thread but sample from the same distribution.
> >>
> >> A particular case in point is the as yet not integrated
> >> MarsagliaTsangWangSmallMeanPoissonSampler (see RNG-91 [1]) which has a
> >> "large" state [2] that takes a "long" time [3] to compute but is
> >> effectively immutable. This could be shared across instances saving
> >> memory for parallel application.
> >>
> >> A copy instance would be almost zero set-up time and provide opportunity
> >> for caching of commonly used samplers.
> >
> > The goal is sharing (immutable) state so it seems that the semantics is
> > not "copy".
> >
> > Isn't it a "factory" that we are after? E.g. something like:
> >
> > public final class CachedSamplingFactory {
> >     private static PoissonSamplerCache poisson = new PoissonSamplerCache();
> >
> >     public PoissonSampler createPoissonSampler(UniformRandomProvider rng,
> >                                                double mean) {
> >         if (!poisson.isCached(mean)) {
> >             // Initialize (requires synchronization) ...
> >             poisson.createCache(mean);
> >         }
> >         // Construct using pre-built state.
> >         return new PoissonSampler(poisson.getCache(mean), rng);
> >     }
> > }
> >
> > [It may be overkill, more work, and less performant…]
>
> But you need a factory for every class you want to share state for. And the
> factory actually has to look in a cache. If you operate on an instance then
> you get what you want: another working version of the same sampler. It would
> also be thread safe without synchronisation as long as the state is
> immutable. The only mutable state is the passed in RNG.

Agreed. It was what I meant by the last sentence.

> >
> > IIUC, you suggest to add "newInstance" in the "DiscreteSampler" interface
> > (?).
>
> I did think of extending DiscreteSampler with this functionality, rather
> than adding to the interface itself: it is currently ‘functional’ because it
> has only one method, and I think that should not change. Having thought
> about it a bit more I like the idea of a new functional interface. Perhaps:
>
> interface DiscreteSamplerProvider {
>     DiscreteSampler create(UniformRandomProvider rng);
> }
>
> Substitute ‘Provider’ for:
>
> - Generator
> - Supplier (possible clash or alignment with Java 8 depending on the way it
>   is done)
> - Factory (though the method is not static so I do not like this)
> - etc
>
> So this then becomes a functional interface that can be used by anything.
> However instances of a sampler would be expected to return a sampler matching
> their own functionality.
>
> Note there are some samplers not implementing an interface that also could
> benefit from this. Namely CollectionSampler and
> DiscreteProbabilityCollectionSampler. So does this need a generic interface:
>
> interface Sampler<T> {
>     T sample();
> }
>
> To be complemented with:
>
> interface SamplerProvider<T> {
>     Sampler<T> create(UniformRandomProvider rng);
> }
>
> So the library would require:
>
> SamplerProvider<T>
> DiscreteSamplerProvider
> ContinuousSamplerProvider
>
> Any sampler can choose to implement being a Provider. There are some cases
> where it is moot. For example a ZigguratNormalizedGaussianSampler just stores
> the rng in the constructor. However it could still be a Provider; the
> method would simply call the constructor. It would allow writing a generic
> multi-threaded application that just uses e.g. a DiscreteSamplerProvider to
> create samplers for each thread. You can then drop in the actual
> implementation you require. For example you could swap the type of
> PoissonSampler in your simulation by swapping the provider for the Poisson
> distribution.
>
> How does that sound?

Fine to have
  DiscreteSamplerProvider
  ContinuousSamplerProvider
[Perhaps the "Supplier" suffix would be better to avoid confusion with
"UniformRandomProvider".]

At first sight, I don't think that the generic interface would have
any actual use since, ultimately, the return value of "sample()" will
be either "int" or "double" (no polymorphism).

Gilles

>
> Alex
>
> >
> > I'm a bit wary that this would compound two different functionalities:
> >  * data generator (method "sample"),
> >  * generator generator (method "newInstance").
> > [But I currently don't have an example where this would be a problem.]
> >
> > Regards,
> > Gilles
> >
> >> Alex
> >>
> >> [1] https://issues.apache.org/jira/browse/RNG-91
> >>
> >> [2] kB, or possibly MB, of tabulated data
> >>
> >> [3] Set-up cost for a Poisson sampler is in the order of 30 to 165 times
> >> as long as a SmallMeanPoissonSampler for a mean of 2 and 32. Note
> >> however that construction still takes only 1.1 and 4.5 microseconds for
> >> the "long" time.
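
For illustration only, here is a minimal sketch of the instance-based idea
discussed in this thread: a sampler that computes its table once and hands
out new instances that share the table but use a different RNG. Nothing
below is library code. "Rng" is a one-method stand-in for
UniformRandomProvider, "DiscreteSamplerSupplier" is just one of the
candidate names (the thread has not settled on a suffix), and
"TableSampler" is a hypothetical inversion sampler chosen only because its
set-up is easy to read.

```java
import java.util.SplittableRandom;

/** Small demo: one set-up, several samplers sharing the same table. */
final class SharedStateDemo {
    public static void main(String[] args) {
        SplittableRandom seed = new SplittableRandom(123);
        TableSampler base =
            TableSampler.of(seed.split()::nextDouble, new double[] {0.2, 0.3, 0.5});
        for (int t = 0; t < 3; t++) {
            // In a real application each of these would go to its own thread.
            DiscreteSampler s = base.create(seed.split()::nextDouble);
            System.out.println("sampler " + t + " first value: " + s.sample());
        }
    }
}

/** Hypothetical stand-in for UniformRandomProvider, reduced to one method. */
interface Rng {
    double nextDouble();
}

/** Single-method sampler interface, as described in the thread. */
interface DiscreteSampler {
    int sample();
}

/** Hypothetical name for the proposed functional interface. */
interface DiscreteSamplerSupplier {
    DiscreteSampler create(Rng rng);
}

/** Samples an index from a fixed probability table by inversion. */
final class TableSampler implements DiscreteSampler, DiscreteSamplerSupplier {
    /** Cumulative probabilities; never modified after construction. */
    private final double[] cumulative;
    /** The only mutable state. */
    private final Rng rng;

    /** Cheap constructor: reuses a table that has already been built. */
    private TableSampler(double[] cumulative, Rng rng) {
        this.cumulative = cumulative;
        this.rng = rng;
    }

    /** "Expensive" set-up: normalises the input and builds the table. */
    static TableSampler of(Rng rng, double[] probabilities) {
        double sum = 0;
        for (double p : probabilities) {
            sum += p;
        }
        double[] table = new double[probabilities.length];
        double running = 0;
        for (int i = 0; i < probabilities.length; i++) {
            running += probabilities[i] / sum;
            table[i] = running;
        }
        return new TableSampler(table, rng);
    }

    @Override
    public int sample() {
        double u = rng.nextDouble();
        for (int i = 0; i < cumulative.length; i++) {
            if (u < cumulative[i]) {
                return i;
            }
        }
        // Guard against rounding error in the last table entry.
        return cumulative.length - 1;
    }

    @Override
    public DiscreteSampler create(Rng rng) {
        // Shares the immutable table; only the RNG differs.
        return new TableSampler(cumulative, rng);
    }
}
```

Because the table is never written after construction, create() needs no
synchronisation; whether a single class should be both sampler and
supplier is the open question raised above.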