Status: New
Owner: ----
Labels: Type-Defect Priority-Low
New issue 3927 by dana.jac...@gmail.com: randprime is not uniform
http://code.google.com/p/sympy/issues/detail?id=3927
randprime uses the PRIMEINC method for generating random primes, which
output primes with a probability related to their primegap size. Simple
example:
{
from sympy import randprime
from collections import Counter
# Note that 37, 53, and 59 are selected 3x more often than 43
Counter([randprime(2**5,2**6-1) for i in xrange(1000000)])
Counter({37: 183002, 53: 182149, 59: 181035, 61: 151219, 47: 121250, 41:
120884, 43: 60461})
# this is a more obnoxious example because I've looked at a prime gap
table. See how 31469 gets selected over 30x more often than some of the
twins.
Counter([randprime(31300,31907) for i in xrange(1000000)])
Counter({31469: 118072, 31847: 49291, 31891: 40960, ..., 31729: 3248,
31723: 3244, 31321: 3182})
}
Fortunately the code has guards to keep out of range primes from being
output, so it isn't all bad. See the paper Pierre-Alain Fouque and Mehdi
Tibouchi, "Close to Uniform Prime Number Generation With Fewer Random
Bits", 2011. [http://eprint.iacr.org/2011/481] for a discussion of
distributions. Their method looks nice, but is specific to generating a
power of 2 range.
An alternate implementation with uniform distribution but slower would be
to (1) bring the ranges in to low = next_prime(low-1), high =
prev_prime(high+1). Check that low <= high (if not, there is no prime in
the range). Now repeat selecting a random number in the interval and
return it if it is prime. For example right after the map:
{
a = nextprime(a-1)
b = prevprime(b+1)
if a > b:
raise ValueError("no primes exist in the specified range")
while True:
n = random.randint(a, b)
if isprime(n):
return n
}
This gives a uniform distribution (see either of the examples above), but
it is 2-4x slower (having PR 3809 helps as it speeds up isprime, nextprime,
and prevprime). I'd like to see the above perhaps have a loop limit just
in case there's something screwy with randint.
For small inputs, another way that works well and can be faster, is to
select the nth prime in the range primecount(2,low), primecount(low,high).
This relies on having really fast ranged primecount and nth_prime for small
values.
--
You received this message because this project is configured to send all
issue notifications to this address.
You may adjust your notification preferences at:
https://code.google.com/hosting/settings
--
You received this message because you are subscribed to the Google Groups
"sympy-issues" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to sympy-issues+unsubscr...@googlegroups.com.
To post to this group, send email to sympy-issues@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy-issues.
For more options, visit https://groups.google.com/groups/opt_out.