New submission from Mark Bell <mark00b...@googlemail.com>:
The docstring for `random.choices` indicates that

```
import random
random.choices(population, k=1)
```

should produce a list containing one item, where each item of `population` has equal likelihood of being selected.

However, `random.choices` draws elements for its sample by doing `population[floor(random() * len(population))]` and so relies on floating-point numbers. Therefore not every item is equally likely to be chosen, since floats are not uniformly dense in [0, 1], and this problem becomes worse as `population` becomes larger.

Note that this issue does not apply to `random.choice(population)`, since this uses `random.randint` to choose a random element of `population` and performs exact integer arithmetic. Compare
https://github.com/python/cpython/blob/main/Lib/random.py#L371
and
https://github.com/python/cpython/blob/main/Lib/random.py#L490

Could `random.choices` fall back to doing `return [choice(population) for _ in _repeat(None, k)]` if no weights are given?

Similarly, is it also plausible to rely only on `random.randint` and integer arithmetic if all of the (cumulative) weights given to `random.choices` are integers?

----------
components: Library (Lib)
messages: 415981
nosy: Mark.Bell
priority: normal
severity: normal
status: open
title: random.choice and random.choices have different distributions
versions: Python 3.11

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue47114>
_______________________________________
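
To make the concern in the report concrete, here is a minimal sketch, not the CPython implementation. It assumes a toy model in which `random()` returns one of `2**bits` equally likely values `k / 2**bits` (the real generator uses 53 bits), and counts how many of those values `floor(x * n)` sends to each index. The names `bucket_counts`, `choices_unweighted`, and the `bits` parameter are hypothetical and exist only for this illustration. The second function sketches the fallback proposed in the report, using the public `random.choice` and `itertools.repeat`.

```python
import random
from itertools import repeat
from math import floor


def bucket_counts(n, bits):
    """Count how many of the 2**bits values k / 2**bits map to each index.

    Toy model (an assumption for illustration): random() is taken to
    return each value k / 2**bits with equal probability.
    """
    counts = [0] * n
    for k in range(2 ** bits):
        x = k / 2 ** bits           # one possible output of the toy random()
        counts[floor(x * n)] += 1   # same index mapping as quoted in the report
    return counts


def choices_unweighted(population, k=1):
    """Sketch of the proposed fallback: k independent calls to random.choice."""
    return [random.choice(population) for _ in repeat(None, k)]


if __name__ == "__main__":
    # With 16 possible values and 3 items, the buckets cannot all be equal.
    print(bucket_counts(3, 4))
    # A power-of-two population size divides the values evenly.
    print(bucket_counts(4, 4))
    print(choices_unweighted(["a", "b", "c"], k=5))
```

In this toy model the bucket sizes differ by at most one whenever `n` does not divide `2**bits`, so the relative imbalance is roughly `n / 2**bits`. With 53 bits that is negligible for small populations and grows with `len(population)`, which matches the behaviour described in the report.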