On 12/30/19 4:22 PM, Andrew Barnert via Python-ideas wrote:
On Dec 30, 2019, at 06:50, Richard Damon <rich...@damon-family.org> wrote:
On 12/30/19 12:06 AM, Andrew Barnert wrote:
On Dec 29, 2019, at 20:04, Richard Damon <rich...@damon-family.org> wrote:
Thus your total_order, while not REALLY a total order, is likely good enough
for most purposes.
Well, it is a total order of equivalence classes (with all IEEE-equal values
being equivalent, all negative NaNs being equivalent, and all positive NaNs
being equivalent).
But whether it’s good enough for most purposes depends on what those purposes
are.
I’m having a hard time imagining any purposes where “some well-defined but
arbitrary, unnatural, and idiosyncratic total order of equivalence classes of
the binary64 values” is of any use at all. Maybe that’s just a failure of my
imagination. But “the specific well-defined but arbitrary, unnatural order that
almost every other similar bit of code in every language is likely to use”
seems a lot more likely to be useful.
(And, needless to say, so does “this specific order that I defined myself”. I
don’t know why the OP wanted to treat all NaNs as equivalents values greater
than inf, but presumably he does know, and that’s fine. So, I don’t think that
needs to be built into the module as one of the options, but I don’t see why he
shouldn’t be able to specify it explicitly.)
But it DOESN'T make all positive NaNs equivalent, as they do not compare 'equal'.
Neither is less than the other, but they don't compare equal. If you use a sort
function (not the one that Python currently uses) that uses an equality test as part
of its sort (rather than just letting the falsehood of a < b and b< a imply
that a == b) then it will break and perhaps return illogical results. This is getting
into the weeds of corner cases, so may not be important.
Yes it does. If all positive NaNs sort the same with this key function and the
Python sorted function, they form an equivalence class under that function. The
fact that they wouldn’t form an equivalence class under a different function is
irrelevant. It’s like arguing that 1 and 8 can’t be in the same equivalence
class for mod 7 addition because they’re not equivalent under mod 11 addition.
And I don’t even know what you’re trying to argue here. That we shouldn’t allow
this key function because it wouldn’t provide a total order if someone
monkeypatched builtins to replace sorted? And therefore we should invent some
“better” order that nobody has asked for and offer that?
The issue is that the functions generates keys of the form (1, nan) for
NaNs, and two of these do not compare equal, so not an equivalency
class. IF the sort routine only uses <, and assumes that if neither a <
b or b < a then they are equal, but not all sort routine use that
method. I think the current sorted function just uses <, so it works as
a key, it just isn't guaranteed to work for all sort algorithms.
Basically, if a sort ever compares with a Nan using normal comparison
rules, you have opened the door to problems. In this case we will only
ever compare a NaN to another NaN, and depending on the sort routine, it
still might work, and if the algorithm never uses an equality test to
see that they aren't equal (and few sorts actually test for equality) it
won't cause a problem.
One simple solution is to just make the nan tuples be (1, 0) or (-1, 0)
and just make all NaNs a real equivalence class. If you want NaN to sort
by their representation, then you could could make that the second term
of the tuple (as an int)
I will admit that if you want to sort a list that contains a mix of floats,
Decimals, and Rationals, then the tuple approach might be better then the
conversion to a 64 bit int as the key, since that doesn't work for the other
types except by converting them to float first.
Better for what?
If there’s no good use for the OP’s order or the tuple order or IEEE total
order, we shouldn’t try to provide any of them; figuring out which one “makes
sense” or coming up with one that “makes more sense” when we don’t even know
what case we’re trying to make sense of is pointless. Inventing a new order
that nobody asked for and nobody can even imagine a use for doesn’t help
anyone, it’s just more code to write and maintain and one more option to
confuse users with that benefits nobody. It doesn’t matter what properties that
new order has.
If there is a good use for any of them, on the other hand… inventing a
different order that isn’t one any of them asked for still doesn’t help anyone.
We need to provide what they actually asked for, or give them a way to do so
(most obviously, with a key parameter), or just reject their use as unimportant.
Building the 64 bit int key to sort floats by IEEE total_order only
works if all the values are floats, as that key won't compare properly
with other types of numbers. If your data does have other types of
numbers, than making the tuple to force NaNs into their right place, and
let the normal value comparison work for normal numbers (or infinities
which I think get compared right).
As to why NaNs are greater than infinity, that comes from the IEEE
specification on the total_order relationship of floating point numbers.
So what? If you care about IEEE total order, you want IEEE total order, not an
unspecified idiosyncratic order that’s similar to it in some ways.
The question was why were NaNs sorted this way, and the basic answer is
to get a usable order for sorting, you have to put them somewhere, and
the IEEE total order definition is as good as any. Ultimately, no
position would make sense universally. The order that IEEE chose is 1)
a simple order to create (being based fairly simply on the
representation for binary floating point), and 2) putting NaNs at the
extreme may make it easier to remove them if that is what is desired.
The order he generates is very close to the IEEE total order, the
difference are:
1) It doesn't seperate -0 for +0, which shouldn't matter for most
applications.
2) It doesn't provide an order between NaNs, but unless you are taking
special measures to distinguish the NaNs anyway, that doesn't really matter.
--
Richard Damon
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/NWLJBR2ZO4ZJ3FBMKSIN6PHSZXPIV2DF/
Code of Conduct: http://python.org/psf/codeofconduct/