Re: [Python-ideas] a sorting protocol dunder method?

Barry Scott Wed, 06 Dec 2017 12:14:43 -0800


> On 5 Dec 2017, at 01:06, Chris Barker <chris.bar...@noaa.gov> wrote:
> 
> wow! a few time zones (and a day job) really make a difference to taking part 
> in a discussion :-)
> 
> This could be a good idea -- just putting it here for the record as it's 
> mentioned elsewhere.
> 
> I can't think of a way to profile this easily -- we know that having a key 
> function can be helpful, but that doesn't take into account the extra method 
> lookup -- maybe a key function that involves a method lookup??
> 
> If it defines both, it isn't clear which will be used for sorting.
> Should __lt__ take priority, or __key__? Whichever we choose, somebody
> is going to be upset and confused by the choice.
> 
> __sort_key__ would take priority -- that is a no brainer, it's the sort key, 
> it's used for sorting. And __lt__ giving a different result is no more 
> surprising, and probably less surprising, than total ordering being violated 
> in  any other way.


If by no brainer you mean the performance of __sort-key__ is always better of 
__lt__ then I will wask for a proof in the form of benchmarks with enough 
use-case coverage.

> [I got very similar results as Barry with his version: about 5X faster with 
> the key function]
> 
> def outer_key(item):
>     return item.key()
> 
> so we get a function lookup each time it's used.
> 
> However, I'm confused by the results -- essentially NO Change. That extra 
> method lookup is coming essentially for free. And this example is using a 
> tuple as the key, so not the very cheapest possible to sort key.
> 
> Did I make a mistake? is that lookup somehow cached?
> 
> In [36]: run sort_key_test.py
> 10000
> key       0.012529s  10000 calls
> outer_key 0.012139s  10000 calls
> lt        0.048057s 119877 calls
> 
> each run gives different results, but the lt method is always on order of 5X 
> slower for this size list. Sometimes out_key is faster, mostly a bit slower, 
> than key.
> 
> Also, I tried making a "simpler" __lt__ method:
> 
> return (self.value1, self.value2) < (other.value1, other.value2)
> 
> but it was bit slower than the previous one -- interesting.

This is more expensive to execute then my version for 2 reasons.
1) my __lt__ did not need to create any tuples.
2) my __lt__ can exit after only looking at the value1's

> 
> Then I tried a simpler (but probably common) simple attribute sort:
> 
>     def __lt__(self, other):
>         global lt_calls
>         lt_calls += 1
> 
>         return self.value1 < other.value1
> 
>     def key(self):
>         global key_calls
>         key_calls += 1
> 
>         return self.value1
> 
> And that results in about a 6X speedup
> 
> In [42]: run sort_key_test.py
> 10000
> key       0.005157s  10000 calls
> outer_key 0.007000s  10000 calls
> lt        0.041454s 119877 calls
> time ratio: 5.922036784741144
> 
> 
> And, interestingly (t me!) there is even a performance gain for only  a 10 
> item list! (1.5X or so, but still)

My guess is that this is because the __lt__ test on simple types is very fast 
in python.

Barry

_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] a sorting protocol dunder method?

Reply via email to