On Jan 26, 10:07 pm, Steven D'Aprano <[EMAIL PROTECTED] cybersource.com.au> wrote: > On Sat, 26 Jan 2008 12:40:26 -0800, Paul Rubin wrote: > > [EMAIL PROTECTED] writes: > >> > def posmax(seq, key=lambda x:x): > >> > return max(enumerate(seq), key=lambda k: key(k[1]))[0] > > >> Is the Python max able to tell that's the identity function? I don't > >> think so, so your version may be slower and more memory hungry in the > >> common situation where you don't have a key function. So I think my > >> version is better :-) > > > I don't think there will be a noticable memory increase. It's not a DSU > > situation like in sorting, it's just a small constant parameter. Yes > > there might be a small speed hit. The compiler in principle could > > convert the (lambda k: lambda x: k[1]) to something like > > operator.itemgetter(1), but I doubt that it actually does so. > > Actually there is a big speed hit. Until such time that Python has a fast > identity function, and lambda x:x isn't it, your version is about three > times slower than Bearophile's. > > Much to my surprise, the fastest solution I've tried appears to be a pure > Python version not even using max() at all. > > Here are the three versions: > > def posmax1(seq, key=None): # bearophile's version > """Return the position of the first maximum item of a sequence. > It accepts the usual key parameter too.""" > if key: > return max(enumerate(seq), key=lambda k: key(k[1]))[0] > else: > return max(enumerate(seq), key=itemgetter(1))[0] > > def posmax2(seq, key=lambda x:x): # Paul Rubin's version > return max(enumerate(seq), key=lambda k: key(k[1]))[0] > > def posmax3(seq, key=None): # Pure Python version > if key is None: > m = seq[0] > index = 0 > for i, x in enumerate(seq): > if x > m: > m = x > index = i > return index > else: > return NotImplemented > > And my timing results: > > >>> alist = [3]*1000 + [5] + [3]*1000 > >>> assert alist.index(5) == 1000 > >>> assert posmax1(alist) == posmax2(alist) == posmax3(alist) == 1000 > > >>> min(timeit.Timer('posmax1(alist)', > > ... 'from __main__ import posmax1, alist').repeat(number=1000))/1000 > 0.00074387979507446289>>> min(timeit.Timer('posmax2(alist)', > > ... 'from __main__ import posmax2, alist').repeat(number=1000))/1000 > 0.0022983889579772949>>> min(timeit.Timer('posmax3(alist)', > > ... 'from __main__ import posmax3, alist').repeat(number=1000))/1000 > 0.00063846302032470701 > > -- > Steven
I don't think one should dismiss the simplest solution l.index(max(l)) out of hand (benchmark function borrowed from bearophile): =================== posmax.py ================== from operator import itemgetter def simple_posmax(l, key=None): if key: return l.index(max(l, key=key)) else: return l.index(max(l)) def clever_posmax(seq, key=None): if key: return max(enumerate(seq), key=lambda k: key(k[1]))[0] else: return max(enumerate(seq), key=itemgetter(1))[0] def posmax_benchmark(posmax): from time import clock alist = [3]*1000 + [5] + [3]*1000 t = clock() for _ in xrange(6000): r = posmax(alist) print posmax.__name__, round(clock() - t, 2), r if __name__ == "__main__": posmax_benchmark(clever_posmax) posmax_benchmark(simple_posmax) ============================================= marigold:python arno$ python posmax.py clever_posmax 3.25 1000 simple_posmax 0.95 1000 simple_posmax is more than 3x faster on my machine. It's not surprising as even though the list is walked twice, it is all done in C and no new objects have to be created. Then only non-C bit is when the result of max(l) is fed to l.index(). -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list