Fabien COELHO писал 2021-07-04 23:29:
The important property of determinism is that if I set a seed, and then make an identical set of calls to the random API, the results will be identical every time, so that it's possible to write tests with predictable/repeatable results.

Hmmm… I like my stronger determinism definition more than this one:-)

I would work around that by deriving another 128 bit generator instead of splitmix 64 bit, but that is overkill.

Not really relevant now, but I'm pretty sure that's impossible to do.
You might try it as an interesting academic exercise, but I believe
it's a logical impossibility.

Hmmm… I was simply thinking of seeding a new pg_prng_state from the
main pg_prng_state with some transformation, and then iterate over the
second PRNG, pretty much like I did with splitmix, but with 128 bits
so that your #states argument does not apply, i.e. something like:

 /* select in a range with bitmask rejection */
 uint64 pg_prng_u64_range(pg_prng_state *state, uint64 range)
 {
    /* always advance state once */
    uint64 next = xoroshiro128ss(state);
    uint64 val;

    if (range >= 2)
    {
        uint64 mask = mask_u64(range-1);

        val = next & mask;

        if (val >= range)
        {
            /* copy and update current prng state */
            pg_prng_state iterator = *state;

            iterator.s0 ^= next;
            iterator.s1 += UINT64CONST(0x9E3779B97f4A7C15);

            /* iterate till val in [0, range) */
            while ((val = xoroshiro128ss(&iterator) & mask) >= range)
                ;
        }
    }
    else
        val = 0;

    return val;
 }

The initial pseudo-random sequence is left to proceed, and a new PRNG
is basically forked for iterating on the mask, if needed.

I believe most "range" values are small, much smaller than UINT32_MAX.
In this case, according to [1] fastest method is Lemire's one (I'd take
original version from [2])

Therefore combined method pg_prng_u64_range could branch on range value

uint64 pg_prng_u64_range(pg_prng_state *state, uint64 range)
{
  uint64 val = xoroshiro128ss(state);
  uint64 m;
  if ((range & (range-1) == 0) /* handle all power 2 cases */
    return range != 0 ? val & (range-1) : 0;
  if (likely(range < PG_UINT32_MAX/32)
  {
    /*
* Daniel Lamire's unbiased range random algorithm based on rejection sampling
     * https://lemire.me/blog/2016/06/30/fast-random-shuffling/
     */
    m = (uint32)val * range;
    if ((uint32)m < range)
    {
      uint32 t = -range % range;
      while ((uint32)m < t)
        m = (uint32)xoroshiro128ss(state) * range;
    }
    return m >> 32;
  }
  /* Apple's mask method */
  m = mask_u64(range-1);
  val &= m;
  while (val >= range)
    val = xoroshiro128ss(state) & m;
  return val;
}

Mask method could also be faster when range is close to mask.
For example, fast check for "range is within 1/1024 to mask" is
  range < (range/512 + (range&(range*2)))

And then method choose could like:
if (likely(range < UINT32_MAX/32 && range > (range/512 + (range&(range*2)))))

But I don't know does additional condition worth difference or not.

[1] https://www.pcg-random.org/posts/bounded-rands.html
[2] https://lemire.me/blog/2016/06/30/fast-random-shuffling/

regards,
Sokolov Yura


Reply via email to