Steven D'Aprano <steve+pyt...@pearwood.info> added the comment:

Rémi, I've read over your patch and have some comments:

(1) You call sorted() to produce a list, but then instead of retrieving the 
item using ``data[i-1]`` you use ``itertools.islice``. That seems unnecessary 
to me. Do you have a reason for using ``islice``?
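
To make the comparison concrete, here is a minimal sketch of the two 
approaches. The function names ``select_by_index`` and ``select_by_islice`` 
are hypothetical and are not taken from the patch; both assume a 1-based 
rank ``i``.

```python
from itertools import islice


def select_by_index(data, i):
    """Return the i-th smallest item (1-based) by indexing the sorted list."""
    ordered = sorted(data)  # sorted() always returns a new list
    if not 1 <= i <= len(ordered):
        raise ValueError("i must be between 1 and len(data)")
    return ordered[i - 1]


def select_by_islice(data, i):
    """Same result, but stepping through the sorted list with islice."""
    ordered = sorted(data)
    if not 1 <= i <= len(ordered):
        raise ValueError("i must be between 1 and len(data)")
    # islice has to walk past the first i-1 items before yielding one
    return next(islice(ordered, i - 1, None))
```

Since ``sorted()`` has already built a list, direct indexing gives the same 
answer without stepping through the first ``i-1`` items.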

(2) select is not very useful on its own; we actually want it so we can 
calculate quantiles, e.g. percentiles, deciles, quartiles. If we want the 
k-quantiles (e.g. k=100 for percentiles) then there are k+1 boundary values in 
total, including the minimum and maximum. E.g. quartiles divide the data set 
into four equal sections, so there are five boundary values including the min 
and max.
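
As a rough illustration of the k+1 count, the sketch below returns the 
boundary values for a given k. The name ``quantile_boundaries`` is 
hypothetical, and linear interpolation between neighbouring order statistics 
is only one of several quantile conventions, chosen here as an assumption.

```python
def quantile_boundaries(data, k):
    """Return the k+1 boundary values that split data into k sections.

    Assumption: boundary j sits at position j*(n-1)/k in the sorted data,
    with linear interpolation between the two neighbouring items.
    """
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("data must not be empty")
    if k < 1:
        raise ValueError("k must be at least 1")
    boundaries = []
    for j in range(k + 1):  # j = 0 is the minimum, j = k is the maximum
        pos = j * (n - 1) / k
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        boundaries.append(ordered[lo] + (ordered[hi] - ordered[lo]) * frac)
    return boundaries
```

For example, ``quantile_boundaries(range(1, 10), 4)`` yields five values, the 
first being the minimum and the last the maximum.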

So the caller is likely to be calling select repeatedly on the same data set, 
and hence making a copy of that data and sorting it repeatedly. If the data set 
is small, repeatedly making sorted copies is still cheap enough, but for large 
data sets, that will be expensive.

Do you have any thoughts on how to deal with that?
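
One possible direction, purely as a sketch, is to accept several ranks in a 
single call so the data is copied and sorted only once. The name 
``select_many`` and its signature are hypothetical, not something proposed in 
the patch.

```python
def select_many(data, ranks):
    """Return the items at the given 1-based ranks, sorting data only once."""
    ordered = sorted(data)  # one copy and one sort, shared by every rank
    n = len(ordered)
    ranks = list(ranks)     # allow any iterable, including a generator
    for i in ranks:
        if not 1 <= i <= n:
            raise ValueError(f"rank {i} is outside 1..{n}")
    return [ordered[i - 1] for i in ranks]
```

A caller wanting quartiles could then request all five ranks at once instead 
of calling select five times on the same data.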

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35775>
_______________________________________