On Dec 29, 2019, at 23:50, Steven D'Aprano <st...@pearwood.info> wrote:
> 
> On Sun, Dec 29, 2019 at 06:23:03PM -0800, Andrew Barnert via Python-ideas 
> wrote:
> 
>> Likewise, it’s even easier to write ignore-nan yourself than to write the 
>> DSU yourself:
>> 
>>    median = statistics.median(x for x in xs if not x.is_nan())
> 
> Try that with xs = [1, 10**400, 2] and come back to me.

Presumably the end user (unlike the statistics module) knows what data they 
have. You use this code when you’re passing in Decimals where NaN means missing 
values. You don’t use it when you’re passing in integers, or Decimals where NaN 
means an unexpected error, or Decimals where None rather than NaN means missing 
data, or whatever. Each of those cases is trivial to handle.

Could there be some use case where you know your data is either all ints or all 
Decimals but not which, and NaNs mean missing values? I suppose that’s not 
impossible. In that case, you have to write the appropriate function to filter 
out Decimal NaNs without breaking on ints.
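
For what it's worth, that function is only a few lines. Here is a minimal sketch, assuming the data is all ints or all Decimals; the names `is_nan` and `median_ignoring_nan` are made up for illustration and are not part of the statistics module:

```python
import statistics
from decimal import Decimal


def is_nan(x):
    # Hypothetical helper: only Decimal values are asked whether they are NaN.
    # Ints, however large, can never be NaN, so they are never converted.
    return isinstance(x, Decimal) and x.is_nan()


def median_ignoring_nan(xs):
    # Treat NaNs as missing values and drop them before taking the median.
    return statistics.median(x for x in xs if not is_nan(x))
```

Because nothing is converted to float, this also copes with xs = [1, 10**400, 2].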

On Dec 29, 2019, at 23:59, Steven D'Aprano <st...@pearwood.info> wrote:
> 
> On Sun, Dec 29, 2019 at 08:32:52PM -0800, Andrew Barnert via Python-ideas 
> wrote:
> 
>> The 95% case is handled by just ignore and raise. Novices should 
>> probably never be using anything else.
>> 
>> Experts will definitely often want poison. And probably sometimes fast 
>> for backward compatibility and/or performance. That gets you to 98%.
>> 
>> Experts will rarely but not never want total order.
> 
> Can you explain the scenario where somebody using median will want 
> negative NANs to sort to the beginning, below -INF, and positive NANs to 
> sort to the end, above +INF?

I don’t know of one. I just assumed that since at least two people asked for it 
on this thread, some people will sometimes want it.

(I could maybe imagine total order being useful for non-NaN cases, to make sure 
I get the same value for [0, -0, -10] in my Python code and my C++ code. But 
nobody’s actually asked for that; they asked for it specifically for NaN.)

>> And experts might also want something different from IEEE total order, 
>> like uniformly pushing all NaNs to the end.
> 
> Likewise. I'm sure there are many uses of sorting NANs to one end, but 
> when will it be useful for median?

Likewise, I assumed that since the original post that started this thread was 
specifically asking for that, at least one person has a use for it. I have no 
idea what that use is.

Maybe both of these requests were spurious. If no custom ordering is ever 
needed, then great; we’re down to raise and ignore, plus 
fast/backward-compatible, plus maybe poison. These can all be implemented 
pretty easily for the four types that median actually supports. And doing them 
all in the obvious way will mean they all happen to work for all types that are 
totally-ordered-except-NaN with a NaN whose semantics are IEEE and that can be 
detected by the same method as Decimal, which is already more than good enough.
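
To make "the obvious way" concrete, here is a rough sketch of those policies as a wrapper around statistics.median. The wrapper name, the `nan_policy` parameter, and the policy strings are all assumptions for illustration; nothing like this exists in the statistics module today:

```python
import math
import statistics


def _is_nan(x):
    # Floats are checked with math.isnan; anything else is asked via an
    # is_nan() method if it has one (as Decimal does). Ints and Fractions
    # have no such method and are never NaN.
    if isinstance(x, float):
        return math.isnan(x)
    method = getattr(x, "is_nan", None)
    return bool(method()) if method is not None else False


def median_with_policy(xs, nan_policy="raise"):
    # Hypothetical wrapper, not the statistics module's API.
    xs = list(xs)
    if nan_policy == "fast":
        # No NaN check at all: whatever statistics.median does today.
        return statistics.median(xs)
    if nan_policy not in ("raise", "ignore", "poison"):
        raise ValueError(f"unknown nan_policy: {nan_policy!r}")
    nans = [x for x in xs if _is_nan(x)]
    if nans:
        if nan_policy == "raise":
            raise ValueError("NaN in data")
        if nan_policy == "poison":
            # Any NaN in the input makes the result NaN.
            return nans[0]
        # "ignore": treat NaNs as missing values.
        xs = [x for x in xs if not _is_nan(x)]
    return statistics.median(xs)
```

Any other type that has an is_nan() method gets the same treatment for free, which is all I meant by "happen to work".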

But if any custom ordering is ever needed, I can’t see how it helps to make it 
easy to use an order that’s somewhat similar to the two that people have asked 
for but not the same as either. If anyone needs the ordering defined by 
Decimal.compare_total, they probably know that’s the function they need, and 
they ought to know how to pass a key function. So, rather than trying to add a 
fifth option that gives them something they don’t want, just add a key 
parameter.
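
As a sketch of what I mean (statistics.median takes no key argument, so `median_by_key` below is a hypothetical stand-in, and the even-length branch is just the usual average of the two middle values):

```python
import functools
from decimal import Decimal


def median_by_key(data, *, key=None):
    # Hypothetical median that sorts with a caller-supplied key function.
    data = sorted(data, key=key)
    n = len(data)
    if n == 0:
        raise ValueError("no median for empty data")
    mid = n // 2
    if n % 2 == 1:
        return data[mid]
    return (data[mid - 1] + data[mid]) / 2


# Decimal.compare_total returns Decimal(-1), Decimal(0) or Decimal(1);
# int() turns that into the plain number cmp_to_key expects.
decimal_total_order = functools.cmp_to_key(
    lambda a, b: int(a.compare_total(b))
)
```

Then somebody who really wants that ordering writes median_by_key(xs, key=decimal_total_order), and somebody who wants all NaNs pushed to the end passes a different key instead.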
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/RGSHWKWJYX3SKRZMQZN7NZ6UKBESDCB7/
