On Dec 26, 2019, at 12:19, Marco Sulla via Python-ideas 
<python-ideas@python.org> wrote:
> 
> IMHO, another sorted function, slower than it, should be added.

You can very easily just write a key function that does this for you. In fact, 
you can write different key functions for different variations.

For example, if you’re specifically sorting floats and want to shift nans to 
the right (even beyond inf):

    class shift_nan_to_end_key(float):
        def __lt__(self, other):
            return math.isnan(other) or not math.isnan(self) and 
float(self)<float(other)

If you want to sort nans to the start, or deal with some other type with 
incomparable values, or do something more general with all pairs of 
incomparable values, you can just write a different key function. Whatever’s 
most appropriate for readability or type safety or performance for your 
particular use case, do that.

In some cases, it might be more readable to write as a cmp function instead of 
a key function, but that’s fine too. To show an example (although maybe not a 
compelling one), you can get the behavior of your algorithm below:

    @functools.cmp_to_key
    def flip_incomparables_key(a, b):
        if a < b: return -1
        if b < a: return 1
        return 1

This treats incomparable values (and equal values) as greater instead of as 
equal, which is what you described… although that doesn’t actually accomplish 
what you wanted.

If you don’t want to have to specify the key function every time, you can write 
your own sorted_slow:

    sorted_slow_nan = functools.partial(sorted, key=shift_nan_to_end_key)

Or, if you want to take and wrap existing key functions, that’s almost as 
trivial.

If any of these are common enough, they could be added to the Sorting HOWTO 
docs, or even included somewhere in the stdlib so you can just import and use, 
or even I suppose added to builtins (although I doubt any of them are nearly 
common enough for that last one).

> I don't know exactly how timsort works, but I know that it requires only 
> `__lt__()`.
> 
> So probably it checks, element by element, if `elem_b < elem_a`, and probably 
> exchange them if this is `1`. If `0`, it does nothing.

No, it’s not that simple. It couldn’t be that simple unless it was a quadratic 
time algorithm. Which means your change won’t actually work.

> I think that a `sorted_slow()` function should also check, if `elem_a < 
> elem_b == 0`, what `elem_a < elem_b` gives.

Presumably you mean `elem_b < elem_a` here?

But at any rate once you discover that the two elements are incomparable, you 
need to decide what to do about them. If you treat them as equal, you get the 
same behavior as currently. If you do the opposite of that, you get… well, it’s 
complicated, but it doesn’t inherently make any more sense than treating it as 
==, and also doesn’t do what you want for nans. And, without keeping track of 
additional information, there’s no trivial way to get the behavior you want.

> If also this check returns `0`, there are two cases:
> 1. the objects are equal
> 2. elem_a (or also elem_b) are not orderable
> 
> In this case, there's no problem: the function will switch `elem_a` and 
> `elem_b`. If they are equal, it changes nothing.

No, it breaks the stability property. Two values can compare equal but be 
meaningful distinct. This is especially important when using a key 
function—e.g., the common “sort by column C, then by column B” idiom relies on 
the fact that two values that are equal in column B but different in column C 
remain in their column-C order.

> If `elem_a` is unorderable, this way at the end of the algorithm, `elem_a` 
> will be at the end of the iterable.

No it won’t. You can try it and see with nans with the key function above. Try 
tweaking it to return 0 on equal and any of the three choices for the remaining 
incomparable case—none of them do what you want. This is the wrong way to solve 
the problem.

More importantly, “orderable” isn’t a property of a single value, it’s a 
property of a pair of values. The fact that float has values that are 
incomparable with everything (including itself) and everything else is 
comparable with everything else, is a very special case. For most partially 
ordered types, there are values that can be compared with some values but not 
with others. What should happen if a and b are incomparable, but a>c while b<c? 
What if b and c are also incomparable?

So you can’t just say “push all the incomparable values to the right”. Except 
in the very special case of float, in which case you might as well say “push 
all the nan values to the right” in the first place, which is a lot simpler to 
explain, and to implement.

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/MADCX3PJGPJIWIX7FUXFKHBYMFA4ZCBY/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to