[Python-ideas] Re: Fix statistics.median()?

Richard Damon Thu, 26 Dec 2019 13:13:33 -0800

On 12/26/19 3:14 PM, David Mertz wrote:

Maybe we can just change the function signature:
statistics.median(it, do_wrong_ass_thing_with_nans=False)

:-)
But yes, the problem is really with sorted(). However, the implementation of statistics.median() doesn't HAVE TO use sorted(), that's just one convenient way to do it.
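To make the point about sorted() concrete, here is a minimal sketch, not the statistics module's actual code. The helper name `median_nan_aware` and its `nan_policy` parameter are hypothetical, invented for this illustration. It first shows that every comparison against NaN is False, which is why sorted() has no consistent order to produce, and then decides what to do about NaNs before any sorting happens.

```python
import math

nan = float("nan")

# Every comparison involving NaN is False, so sorted() has no
# consistent total order to rely on when NaNs are present.
comparisons = [nan < 1, nan > 1, nan == nan, 1 < nan]


def median_nan_aware(data, nan_policy="propagate"):
    """Hypothetical helper: settle the NaN question before sorting.

    nan_policy="propagate" returns NaN if any input is NaN;
    nan_policy="omit" drops NaNs and takes the median of the rest.
    """
    values = [float(x) for x in data]
    if any(math.isnan(v) for v in values):
        if nan_policy == "propagate":
            return nan
        if nan_policy == "omit":
            values = [v for v in values if not math.isnan(v)]
        else:
            raise ValueError(f"unknown nan_policy: {nan_policy!r}")
    if not values:
        raise ValueError("no median for empty data")
    # From here on the list is NaN-free, so sorting is well defined.
    values.sort()
    mid = len(values) // 2
    if len(values) % 2:
        return values[mid]
    return (values[mid - 1] + values[mid]) / 2
```

With `[nan, 1, 2, 3]` the "propagate" policy yields NaN and the "omit" policy yields 2, the two plausible answers discussed in this thread.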

Yes, median could do the sort some other way, and in fact the code for median makes a comment to investigate doing it some other way. The fact that median doesn't actually need the full list sorted, says that

There IS NO right answer for `sorted([nan, 1, 2, 3])`. However, there is a very plausibly right answer for `statistics.median([nan, 1, 2, 3])` ... or rather, both 'nan' and '2' are plausible (one approach is what Numpy does, the other is what Pandas does).

Other possible answers would be 1.5 or 2.5 (if the sorting method ended up putting NaNs at the bottom or top of the order), based on the definition of the median as the value that half the values are greater than (or less than). In one sense, if there isn't *A* right answer, NO answer is right.
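As a rough illustration of where 1.5 and 2.5 come from, the sketch below forces NaNs to one end of the order with a sort key and then takes the usual middle value. The function name `median_with_nan_placement` and its `nan_last` flag are made up for this example; nothing like them exists in the statistics module.

```python
import math


def median_with_nan_placement(data, nan_last):
    """Hypothetical helper: treat NaN as the largest or smallest value."""

    def key(v):
        is_nan = math.isnan(v)
        # False sorts before True, so this flag pushes NaNs to the
        # chosen end; the 0.0 placeholder keeps NaN out of comparisons.
        group = is_nan if nan_last else not is_nan
        return (group, 0.0 if is_nan else v)

    ordered = sorted(data, key=key)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [float("nan"), 1.0, 2.0, 3.0]
nan_at_bottom = median_with_nan_placement(data, nan_last=False)  # mean of 1 and 2
nan_at_top = median_with_nan_placement(data, nan_last=True)      # mean of 2 and 3
```

Here `nan_at_bottom` is 1.5 and `nan_at_top` is 2.5, so the result depends entirely on an ordering choice that the data itself does not justify.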

As was pointed out, the statistics module specifically doesn't claim to replace more powerful packages, like Numpy, so expecting it to handle this level of nuance is beyond its specification.


--
Richard Damon
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/UTXEYLXLCGNUQR2XPF3QKMJUZA3UIGJT/
