Yuval S <sadan.yu...@gmail.com> added the comment:
Thank you for the attention and the quick fix.

However, the current documentation for "Notes on Reproducibility" should still address this issue of hash randomization. Not only is `sample` affected by this, but so is any code that combines strings (or bytes or datetime) with hash and random, e.g.

>>> import random
>>> random.seed(6)
>>> a = list(set(str(i) for i in range(500)))
>>> print(a[int(random.random() * 500)])

or this:

>>> import random
>>> import datetime
>>> random.seed(6)
>>> print(random.choice(range(hash(datetime.datetime(2000,1,1)) % 100)))

Both will still produce non-reproducible results even after the fix.

Here is my suggestion for documentation:

> Hash randomization, which is enabled by default since version 3.3, is not
> affected by `random.seed()`. For this reason, code that relies on string
> hashes, such as code that relies on the ordering of `set` or `dict`, might be
> non-reproducible, unless string hash randomization is disabled or seeded
> (see: https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED).

My vote would be to keep hash randomization tied to `random.seed()`; this would make all use cases more predictable, as well as allow `random.sample()` to support `set`.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40325>
_______________________________________
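
For illustration, here is a minimal sketch of the effect of `PYTHONHASHSEED` on the first example above. It assumes a CPython 3.7+ interpreter is available as `sys.executable`; the helper name `run_with_hash_seed` is hypothetical and only serves this sketch. Each call starts a fresh interpreter, because the hash seed is fixed at interpreter startup and cannot be changed from within a running process.

```python
import os
import subprocess
import sys

# The first example from the message, run in a fresh interpreter each time.
SNIPPET = (
    "import random\n"
    "random.seed(6)\n"
    "a = list(set(str(i) for i in range(500)))\n"
    "print(a[int(random.random() * 500)])\n"
)


def run_with_hash_seed(hash_seed):
    """Run SNIPPET in a child interpreter with PYTHONHASHSEED=hash_seed.

    Hypothetical helper for this sketch; hash_seed may be an integer
    or the string "random".
    """
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # A fixed hash seed: the set ordering, and hence the output, repeats.
    print(run_with_hash_seed(0), run_with_hash_seed(0))
    # "random" requests a fresh hash seed per process (the default),
    # so the printed values may differ from run to run.
    print([run_with_hash_seed("random") for _ in range(3)])
```

With a fixed seed the two values on the first line should match; with `"random"` they typically vary, even though `random.seed(6)` is called in every run.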