On Dec 29, 2019, at 23:50, Steven D'Aprano <st...@pearwood.info> wrote:
>
> On Sun, Dec 29, 2019 at 06:23:03PM -0800, Andrew Barnert via Python-ideas
> wrote:
>
>> Likewise, it’s even easier to write ignore-nan yourself than to write the
>> DSU yourself:
>>
>> median = statistics.median(x for x in xs if not x.isnan())
>
> Try that with xs = [1, 10**400, 2] and come back to me.

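For concreteness, here is a minimal sketch of a filter that skips Decimal NaNs without breaking on ints. The one-liner quoted above fails on [1, 10**400, 2] because int has no isnan method, and math.isnan(10**400) would raise OverflowError when converting to float. Note also that Decimal spells the method is_nan(), not isnan(). The helper names is_decimal_nan and median_ignoring_nan are made up for illustration and are not part of the statistics module.

```python
import statistics
from decimal import Decimal


def is_decimal_nan(x):
    """Return True only for Decimal NaNs (quiet or signaling).

    Anything that is not a Decimal, such as an int, is never treated as NaN,
    so huge ints like 10**400 are never converted to float.
    """
    return isinstance(x, Decimal) and x.is_nan()


def median_ignoring_nan(xs):
    """Hypothetical helper: median of xs with Decimal NaNs dropped."""
    return statistics.median(x for x in xs if not is_decimal_nan(x))


# Works for all-int data, including ints too large for a float.
print(median_ignoring_nan([1, 10**400, 2]))  # 2

# Works for Decimal data where NaN marks a missing value.
print(median_ignoring_nan([Decimal(1), Decimal("NaN"), Decimal(3)]))  # 2
```

This only covers the "all ints or all Decimals" case discussed here; float NaNs would need a separate check.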
Presumably the end user (unlike the statistics module) knows what data they have. You use this code when you’re passing in Decimals where NaN means missing values. You don’t use it when you’re passing in integers, or Decimals where NaN means an unexpected error, or Decimals where None rather than NaN means missing data, or whatever. Any of those cases are trivial.

Could there be some use case where you know your data is either all ints or all Decimals but not which, and NaNs mean missing values? I suppose that’s not impossible. In that case, you have to write the appropriate function to filter out Decimal NaNs without breaking on ints.

On Dec 29, 2019, at 23:59, Steven D'Aprano <st...@pearwood.info> wrote:
>
> On Sun, Dec 29, 2019 at 08:32:52PM -0800, Andrew Barnert via Python-ideas
> wrote:
>
>> The 95% case is handled by just ignore and raise. Novices should
>> probably never be using anything else.
>>
>> Experts will definitely often want poison. And probably sometimes fast
>> for backward compatibility and/or performance. That gets you to 98%.
>>
>> Experts will rarely but not never want total order.
>
> Can you explain the scenario where somebody using median will want
> negative NANs to sort to the beginning, below -INF, and positive NANs to
> sort to the end, above +INF?

I don’t know of one. I just assumed that since at least two people asked for it on this thread, some people will sometimes want it. (I could maybe imagine total order being useful for non-NaN cases, to make sure I get the same value for [0, -0, -10] in my Python code and my C++ code. But nobody’s actually asked for that; they asked for it specifically for NaN.)

>> And experts might also want something different from IEEE total order,
>> like uniformly pushing all NaNs to the end.
>
> Likewise. I'm sure there are many uses of sorting NANs to one end, but
> when will it be useful for median?

Likewise, I assumed that since the original post that started this thread was specifically asking for that, at least one person has a use for it. I have no idea what that use is.

Maybe both of these requests were spurious. If no custom ordering is ever needed, then great; we’re down to raise and ignore, plus fast/backward-compatible, plus maybe poison. These can all be implemented pretty easily for the four types that median actually supports. And doing them all in the obvious way will mean they all happen to work for all types that are totally-ordered-except-NaN with a NaN whose semantics are IEEE and that can be detected by the same method as Decimal, which is already more than good enough.

But if any custom ordering is ever needed, I can’t see how it helps to make it easy to use an order that’s somewhat similar to the two that people have asked for but not the same as either. If anyone needs the ordering defined by Decimal.compare_total, they probably know that’s the function they need, and they ought to know how to pass a key function. So, rather than trying to add a fifth option that gives them something they don’t want, just add a key parameter.
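
A rough sketch of what passing a key function could look like, under some loud assumptions: statistics.median has no key parameter, so median_with_key below is a hypothetical stand-in rather than a proposed implementation, and total_order and nans_last are illustrative names. The first key wraps Decimal.compare_total, and the second pushes every NaN to the end regardless of sign.

```python
from decimal import Decimal
from functools import cmp_to_key


def median_with_key(data, key):
    """Hypothetical median that orders values with a caller-supplied key."""
    ordered = sorted(data, key=key)
    n = len(ordered)
    if n == 0:
        raise ValueError("no median for empty data")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Even length: average the two middle values, as statistics.median does.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Total order as defined by Decimal.compare_total:
# -NaN sorts below every number, +NaN sorts above every number.
total_order = cmp_to_key(lambda a, b: int(a.compare_total(b)))


def nans_last(x):
    """Key that sorts all NaNs after every number, ignoring the NaN's sign."""
    # The placeholder Decimal(0) keeps NaNs out of any ordering comparison.
    return (x.is_nan(), Decimal(0) if x.is_nan() else x)


xs = [Decimal("NaN"), Decimal(1), Decimal("-NaN"), Decimal(3), Decimal(2)]

# Sorted as [-NaN, 1, 2, 3, NaN], so the middle element is 2.
print(median_with_key(xs, total_order))  # 2

# Sorted as [1, 2, 3, NaN, -NaN], so the middle element is 3.
print(median_with_key(xs, nans_last))  # 3
```

The two orderings give different medians for the same data, so whoever needs a custom order has to choose it; a single built-in option could not satisfy both.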