On Jul 15, 2013, at 4:13 PM, Martin Buchholz <marti...@google.com> wrote: > > On Mon, Jul 15, 2013 at 9:59 AM, Martin Buchholz <marti...@google.com>wrote: >> >> Also, if gamma is much less than 2^64 (say 2^56; that number of gammas >> "should be enough for anybody"), then the above will be a little more >> efficient since the wraparound code will be rare and well-predicted. The >> bits that become available as a result can then be optionally donated to >> the seed space!
I just looked into this. It's not a good idea to generate 64-bit gamma values and then filter them; it's too costly to do rejection sampling, and if you just throw away bits, you risk getting duplicate gamma values, which is undesirable. So instead I took the latest version and modified the generation of gamma values to use a 56-bit GAMMA_GAMMA value and perform arithmetic modulo 2^56-5 (I will call this prime "Arthur"). I also threw together a 56-bit mixing function "mix56" by simply adding a little bit of masking to mix64; I emphasize, however, that I have not yet performed avalanche testing on mix56. The point of mix56 is that it is (I think) a bijective function of 56-bit values, whereas applying mix64 and then discarding 8 bits would not be. Therefore all the generated gamma values should be different. private static final long GAMMA_PRIME = (1L << 56) - 5; // Arthur, a prime number private static final long GAMMA_GAMMA = 0x00281E2DBA6606F3L; // must be less than Arthur private GreatestRandom(long seed, long splitSeed) { this.seed = seed; long s = (splitSeed & 0x7FFFFFFFFFFFFFFFL) % GAMMA_PRIME, g; do { // ensure gamma >= 13, considered as an unsigned integer s += GAMMA_GAMMA; if (s >= GAMMA_PRIME) s -= GAMMA_PRIME; g = mix56(s); } while (g >= 0L && g < 13L); this.gamma = g; this.nextSplit = s; } private static long mix56(long z) { z ^= (z >>> 33); z *= 0xff51afd7ed558ccdL; z &= 0x00FFFFFFFFFFFFFFL; z ^= (z >>> 33); z *= 0xc4ceb9fe1a85ec53L; z &= 0x00FFFFFFFFFFFFFFL; z ^= (z >>> 33); return z; } So here's the punchline: this change makes my Monte Carlo calculation of pi run over 19% faster! Hard to believe. It's certainly avoiding most of the hard work in addGammaModGeorge, and perhaps also making branch prediction totally win, but 19% ??? Someone please verify this independently. Still to be done: (a) verify avalanche behavior of mix56; (b) run more DieHarder tests. --Guy