On 3/12/2011 8:28 PM, Steven D'Aprano wrote:
Fredrik Johansson wrote:
Consider sorting a list of pairs representing fractions. This can be
done easily in Python 2.x with the comparison function lambda
(p,q),(r,s): cmp(p*s, q*r). In Python 2.6, this is about 40 times
faster than using fractions.Fraction as a key function.
[steve@sylar ~]$ python2.7 -m timeit -s "L = [(1,2), (3,4), (0,5),
(9,100), (3,7), (2,8)]" "sorted(L, lambda (p,q),(r,s): cmp(p*s, q*r))"
10000 loops, best of 3: 25.1 usec per loop
[steve@sylar ~]$ python2.7 -m timeit -s "L = [(1,2), (3,4), (0,5),
(9,100), (3,7), (2,8)]" -s "from fractions import Fraction" "sorted(L,
key=lambda t: Fraction(*t))"
1000 loops, best of 3: 236 usec per loop
So for a short list, I get a factor of ten difference. For a longer
list, I'd expect the key function to win out. Much to my astonishment,
it doesn't -- I get similar results regardless of the size of L.
Size of L key/cmp
========== =========
6 9.4
600 13.9
60000 7.0
6000000 6.7
Interesting. Comparing two Fractions must do the same thing as the cmp
function, compare two products, but it does so with a *lot* more overhead:
571 def _richcmp(self, other, op):
572 """Helper for comparison operators, for internal use
574 Implement comparison between a Rational instance and
575 either another Rational instance or a float `other`. If
576 `other` is not a Rational instance or a float, return
577 NotImplemented. `op` should be one of the six standard
578 comparison operators.
580 """
581 # convert other to a Rational instance where reasonable.
582 if isinstance(other, numbers.Rational):
583 return op(self._numerator * other.denominator,
584 self._denominator * other.numerator)
585 if isinstance(other, float):
586 if math.isnan(other) or math.isinf(other):
587 return op(0.0, other)
588 else:
589 return op(self, self.from_float(other))
590 else:
591 return NotImplemented
592
593 def __lt__(a, b):
594 """a < b"""
595 return a._richcmp(b, operator.lt)
For this example, and any with suitably restricted values, key=float
would work and should beat the cmp version. But of course, someone will
have a use case for which that will not work. (But then, one could do a
near linear scan to properly re-sort slices with equal float keys.)
Hmm. As I remember, one rationale for deleting cmp is that nlogn slow
cmp() calls are slower than n slow key() calls, but that only matters if
the nlogn '<' comparisions of stored keys are fast compared to cmp calls
(as tends to be the case when the keys are builtin class instances).
But in this case, they are much slower. To be faster, one would need
something like "key=lambda p,q:p*(lcm//q)", where lcm is the least
common multiple of of all the q's (denominators). For the example above,
lcm = 700. But for longer lists, it could be outrageously large.
--
Terry Jan Reedy
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com