[issue37682] random.sample should support iterators

2019-07-26 Thread Raymond Hettinger
Raymond Hettinger added the comment: Thomas, thank you for the suggestion, but I think we should decline. The API for sample() is consistent with choice(), choices(), and shuffle(). For the most part, the sequence-based API has worked out well. -- resolution: -> rejected stage: ->

[issue37682] random.sample should support iterators

2019-07-26 Thread Serhiy Storchaka
Serhiy Storchaka added the comment:

> ISTM that if a generator produces so much data that it is infeasible to fit
> in memory, then it will also take a long time to loop over it and generate a
> random value for each entry.

Good point!

$ ./python -m timeit -s 'from random import sample as

[issue37682] random.sample should support iterators

2019-07-26 Thread Raymond Hettinger
Raymond Hettinger added the comment: ISTM that if a generator produces so much data that it is infeasible to fit in memory, then it will also take a long time to loop over it and generate a random value for each entry. FWIW, every time we've looked at reservoir sampling it has been less

[issue37682] random.sample should support iterators

2019-07-25 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Possible implementation:

    from itertools import islice as _islice

    def reservoir_sample(self, population, k):
        if k < 0:
            raise ValueError("Sample is negative")
        it = iter(population)
        result = list(_islice(it, k))
        if len(result) < k:
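The preview above is cut off in the archive, so the rest of the proposed method is not shown here. For readers unfamiliar with the technique, the following is a minimal, self-contained sketch of classic reservoir sampling (often called Algorithm R). It is not the code from the tracker: the standalone function form, the `rng` parameter, and the error raised for a short population are assumptions made for illustration.

```python
import random
from itertools import islice


def reservoir_sample(population, k, rng=random):
    """Return k items chosen uniformly from any iterable (Algorithm R).

    Illustrative sketch only; `rng` is a hypothetical parameter that
    accepts any object with a `randrange` method.
    """
    if k < 0:
        raise ValueError("Sample is negative")
    it = iter(population)
    # Fill the reservoir with the first k items.
    result = list(islice(it, k))
    if len(result) < k:
        # Assumption: mirror random.sample() and reject short populations.
        raise ValueError("Sample larger than population")
    # Item number i (0-based) replaces a reservoir slot with probability
    # k / (i + 1), which keeps every item equally likely to survive.
    for i, item in enumerate(it, start=k):
        j = rng.randrange(i + 1)
        if j < k:
            result[j] = item
    return result


picked = reservoir_sample((n * n for n in range(10_000)), 5)
```

Only `k` items are held in memory at a time, but one random number is drawn for every remaining element of the input, which is the cost discussed elsewhere in this thread. The order of the returned items is not itself random, so a caller needing a random order would have to shuffle the result.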

[issue37682] random.sample should support iterators

2019-07-25 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: FYI, random.sample() (like most other functions in the random module) is implemented in pure Python. -- nosy: +serhiy.storchaka

[issue37682] random.sample should support iterators

2019-07-25 Thread mental
Change by mental: -- nosy: +mark.dickinson, rhettinger

[issue37682] random.sample should support iterators

2019-07-25 Thread Thomas Dybdahl Ahle
New submission from Thomas Dybdahl Ahle: Given a generator `f()` we can use `random.sample(list(f()), 10)` to get a uniform sample of the values generated. This is fine, and fast, as long as `list(f())` easily fits in memory. However, if it doesn't, one has to implement the reservoir sampling
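As a small illustration of the list-based approach the submission describes, the sketch below materializes a generator before sampling from it. The generator `f` here is a hypothetical stand-in, since the report does not say what `f()` produces.

```python
import random


def f():
    # Hypothetical generator standing in for the `f()` in the report.
    for i in range(100_000):
        yield i * i


# random.sample() needs a sequence, so the generator is materialized first.
# This holds every generated value in memory at once.
sample = random.sample(list(f()), 10)
```

This works as long as the full list fits in memory, which is exactly the limitation the submission raises.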