[Python-ideas] Re: NAN handling in statistics functions

Brendan Barnwell Thu, 26 Aug 2021 12:06:44 -0700

On 2021-08-23 20:53, Steven D'Aprano wrote:

> So I propose that statistics functions gain a keyword-only parameter to
> specify the desired behaviour when a NAN is found:
>
> - raise an exception
> - return NAN
> - ignore it (filter out NANs)
>
> which seem to be the three most common preferences. (It seems to be
> split roughly equally between the three.)
>
> Thoughts? Objections?

I agree that these are the three options that should be available because they're the most commonly used ones in other tools that handle NANs (like numpy and pandas).
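To make the three options concrete, here is a minimal sketch of what such a keyword-only parameter could look like. The wrapper name `median_with_nan_policy`, the parameter name `nan_policy`, and its values `"raise"`, `"return"` and `"ignore"` are all made up for illustration; nothing here is an existing or agreed-upon `statistics` API.

```python
import math
import statistics


def _is_nan(x):
    """Return True if x is a NAN; non-numeric values are treated as not-NAN."""
    try:
        return math.isnan(x)
    except TypeError:
        return False


def median_with_nan_policy(data, *, nan_policy="return"):
    """Hypothetical wrapper around statistics.median.

    `nan_policy` is an assumed name, not a real parameter of the
    statistics module. It selects one of the three behaviours
    discussed in this thread.
    """
    if nan_policy not in ("raise", "return", "ignore"):
        raise ValueError(f"unknown nan_policy: {nan_policy!r}")

    values = list(data)
    if any(_is_nan(x) for x in values):
        if nan_policy == "raise":
            raise ValueError("NAN found in data")
        if nan_policy == "return":
            return math.nan
        # nan_policy == "ignore": filter the NANs out before computing.
        values = [x for x in values if not _is_nan(x)]

    return statistics.median(values)


data = [1.0, float("nan"), 3.0]
print(median_with_nan_policy(data, nan_policy="ignore"))  # 2.0
print(median_with_nan_policy(data, nan_policy="return"))  # nan
```

Whichever spelling is chosen, the point is only that the caller states the policy explicitly instead of getting an order-dependent result.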

> Does anyone have any strong feelings about what should be the default?

I'm conflicted. The NAN-aware tool I use most is Pandas, which for the most part handles NANs by filtering them out, and this is very handy. But that's partly because Pandas has a lot of NAN-awareness built in (making it easy to, for instance, fill in NANs with some default or imputed value).

I think I'd lean toward "return NAN" as the best default, as it seems most consistent with how NAN works in ordinary mathematical expressions (e.g., `2 + nan`).
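For reference, a short illustration of that propagation using only plain floats and the `math` module. The variable names are arbitrary.

```python
import math

nan = float("nan")

# Any arithmetic expression involving a NAN yields a NAN.
total = 2 + nan
print(math.isnan(total))  # True

# The same holds when the NAN is buried in a longer computation.
running_sum = sum([1.0, nan, 3.0])
print(math.isnan(running_sum))  # True

# A NAN compares unequal to everything, including itself,
# so math.isnan is the reliable way to detect it.
print(nan == nan)  # False
```

A "return NAN" default would make the statistics functions behave like `sum` does here: the NAN is not hidden, it shows up in the result.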

One important thing we should think about is whether to add similar handling to `max` and `min`. These are builtin functions, not in the statistics module, but they have similarly confusing behavior with NAN: compare `max(1, 2, float('nan'))` with `max(float('nan'), 1, 2)`. As long as we're handling this for median and so on, it would be nice to have the ability to do NAN-aware max and min as well.
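The order dependence is easy to reproduce, because every comparison against a NAN is false and `max` keeps whichever candidate it already holds. The helper `nan_aware_max` below is a hypothetical name used only to sketch the "return NAN" behaviour; it is not a proposal for the actual builtin signature.

```python
import math

nan = float("nan")

# The NAN comes last: it never compares greater than 2, so 2 wins.
print(max(1, 2, nan))  # 2

# The NAN comes first: nothing compares greater than it, so it stays.
print(max(nan, 1, 2))  # nan


def nan_aware_max(values):
    """Hypothetical helper: return NAN if any input is a NAN, else max()."""
    values = list(values)
    if any(isinstance(x, float) and math.isnan(x) for x in values):
        return math.nan
    return max(values)


# With the helper, the position of the NAN no longer matters.
print(nan_aware_max([1, 2, nan]))  # nan
print(nan_aware_max([nan, 1, 2]))  # nan
print(nan_aware_max([1, 2, 3]))    # 3
```

The same reasoning applies to `min`, with the comparison reversed.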


--
Brendan Barnwell

"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail."

   --author unknown
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZSPIO2YAVUXPZM7W7OHQDHZITQ4ZNO2H/
Code of Conduct: http://python.org/psf/codeofconduct/