On Sun, Dec 29, 2019 at 11:33 PM Andrew Barnert <abarn...@yahoo.com> wrote:
> IEEE total order specifies a distinct order for every distinct bit
> pattern, and tries to do so in a way that makes sense.

Ok, ok... I've got "learned up" about this three times now :-). Given we cannot control those bit patterns from Python, I'm a bit "meh"... but I get the rule (yeah, yeah, struct module).

> The 95% case is handled by just ignore and raise. Novices should probably
> never be using anything else.
> Experts will definitely often want poison. And probably sometimes fast for
> backward compatibility and/or performance. That gets you to 98%.

Fair enough. I really only care about the 98% case. But if you can convince Steven to add `key=` as well, no real harm to me. My only concern is a beginner who types `help(median)` and scratches her head over the key oddness. But I guess the docstring can say "Don't worry about this if you don't need a custom sort order for your objects."

It's also no real extra work to pass along a `key` argument to the `sorted()` internal to the function. I guess on the off chance the implementation moves to Quickselect it will be slightly more work. But I guess really not that much even then (hmmm... I think the implementation would have to contain a kind of DSU inside it though for that). Do remember that using `sorted()` is an implementation detail, not a promise of functions in the statistics module.

> And experts might also want something different from IEEE total order, like
> uniformly pushing all NaNs to the end. I’m not sure when you’d actually
> want that, but since it was the original suggestion that kicked off this
> whole discussion, it’s obviously not inconceivable.

I think that idea of "NaNs to the end" was just ill-conceived nonsense. I mean MAYBE I can see some good in putting the +/-nans at both ends of the order, so MAYBE the stuff in the middle winds up being the median. But if you have 100 nans and 50 real numbers, it seems just silly to automatically select NaN as the median. Especially under the "missing data" use that is so common in data science (Pandas, R, etc).

> I get the idea that, once you’ve already got an on_nan param, adding
> another value to that param doesn’t add as much cognitive load as adding a
> whole other param would. But I think a total order value is so rarely
> useful that it’s probably more load than it’s worth, while a key param is a
> more widely useful and therefore worth more load (although maybe still not
> enough).

Oh... yeah, the 'ieee_total_order' value was absolutely silly and over-specialized. I just threw it in because some folks in the thread mentioned it. Albeit, it's a value to rarely use for the one parameter, so that's a little less burden than another parameter. In Pandas we see that a lot... some parameter will have 20 options, but 95% of the users use the default, and 4.9% use one non-default option. So the remaining 18 options cover the 0.1% use cases.

Yours, David...

-- 
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons. Intellectual property is
to the 21st century what the slave trade was to the 16th.
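
P.S. To make the options discussed above concrete, here is a minimal sketch, not the actual `statistics.median` and not anything that has been agreed on. The name `median_sketch` is hypothetical; the `on_nan` values 'raise', 'ignore' and 'poison' and the `key` parameter are the ones floated in this thread. It assumes NaNs can be detected with `x != x`, and it leans on `sorted()`, which is exactly the implementation detail mentioned above. The last few lines show why "NaNs to the end" picks NaN as the middle element for 100 nans and 50 real numbers.

```python
import math


def _is_nan(x):
    # Assumption: a NaN is any value that does not compare equal to itself.
    return x != x


def median_sketch(data, *, on_nan="raise", key=None):
    """Hypothetical median with the NaN policies discussed in the thread."""
    values = list(data)
    if any(_is_nan(v) for v in values):
        if on_nan == "raise":
            raise ValueError("NaN in data")
        elif on_nan == "poison":
            # Any NaN in the input makes the result NaN.
            return math.nan
        elif on_nan == "ignore":
            # Treat NaNs as missing data and drop them.
            values = [v for v in values if not _is_nan(v)]
        else:
            raise ValueError(f"unknown on_nan option: {on_nan!r}")
    if not values:
        raise ValueError("no median for empty data")
    # Passing key along to sorted() is the "no real extra work" part.
    values = sorted(values, key=key)
    n = len(values)
    mid = n // 2
    if n % 2:
        return values[mid]
    return (values[mid - 1] + values[mid]) / 2


def nans_last_key(x):
    # Sort key that pushes every NaN after every real number.
    return (math.isnan(x), 0.0 if math.isnan(x) else x)


# 100 NaNs and 50 real numbers, ordered with NaNs at the end.
data = [math.nan] * 100 + [float(i) for i in range(50)]
ordered = sorted(data, key=nans_last_key)
middle = ordered[len(ordered) // 2]  # lands in the NaN block
```

With `on_nan="ignore"` the same `data` would give the median of the 50 real numbers, which is what the "missing data" reading wants.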
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/6VBRXYNH37JG6Q6GMMZQTFLKPGANOBCR/
Code of Conduct: http://python.org/psf/codeofconduct/