New submission from Christopher Gurnee:
Due to an optimization in random.randrange() only in Python 2, as the
stop-start range approaches 2^53 the output becomes noticeably biased. This bug
also affects random.SystemRandom.
For example, counting the number of even ints in a set of 10^6 random ints each
in the range [0, 2^52 * 2/3) produces normal results; about half are even:
>>> sum(randrange(2**52 * 2//3) % 2 for i in xrange(1000000)) / 1000000.0
0.499932
Change the range to [0, 2^53 * 2/3), and you get a degenerate case where evens
are two times more likely to occur than odds:
>>> sum(randrange(2**53 * 2//3) % 2 for i in xrange(1000000)) / 1000000.0
0.333339
The issue occurs in three places inside randrange(), here's one:
if istart >= _maxwidth:
return self._randbelow(istart)
return _int(self.random() * istart)
_maxwidth is the max size of a double where every digit to the left of the
decimal point can still be represented w/o loss of precision (2^53, where a
double has 53 mantissa bits). With istart >= _maxwidth, _randbelow() behaves
correctly. With istart < _maxwidth, the rounding error in random() * istart
begins to cause problems as istart approaches 2^53.
Changing _maxwidth to be significantly less should (practically speaking
anyways) fix this, although I'm open to suggestion on just how conservatively
it should be set.
----------
components: Library (Lib)
messages: 241261
nosy: gurnec
priority: normal
severity: normal
status: open
title: random.randrange() biased output
versions: Python 2.7
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue23974>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com