Re: maximum() efficency
Just for completeness: The functions in Steve's original post named maximum calculate the minimum. Also, timing-wise, on my machine with a random list of 20 integers Steve's iteration version and Mitja's version are about equal, the system built-in is equal or slightly slower, and Paul's version about 3-4x slower. If the comparison function is very complex, the mileage may vary of course. -- http://mail.python.org/mailman/listinfo/python-list
Re: maximum() efficency
Arne Ludwig [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Just for completeness: The functions in Steve's original post named maximum calculate the minimum. Also, timing-wise, on my machine with a random list of 20 integers Steve's iteration version and Mitja's version are about equal, the system built-in is equal or slightly slower, and Paul's version about 3-4x slower. If the comparison function is very complex, the mileage may vary of course. Arne - The version I posted (which is not mine by the way, but I've lost the original citation) becomes much more competitive when run with psyco. With your list of 200,000 integers, I believe it will outpace even the C library built-ins. -- Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: maximum() efficency
On Mon, 27 Mar 2006 07:50:01 -0800, Arne Ludwig wrote: Just for completeness: The functions in Steve's original post named maximum calculate the minimum. Er, yes. I benchmarked them but clearly I should have function-tested them as well. Here is, I hope, my last word on maximum(). It figures out whether it got an iterator or a list, and does the right thing. I believe it will compile and run correctly whether your version of Python has iterators or not. And, it doesn't compute the minimum. def maximum(cmp, seq): Return maximum value of seq, as chosen by provided cmp() function. If seq contains multiple maximal values, return the last one. Arguments: cmp -- function reference; function should return values with the same semantics as Python's cmp() function. seq -- any sequence (a list or an iterator). If seq is empty, this function will raise ValueError. # is it a list? if hasattr(seq, __getitem__): if len(seq) == 0: raise ValueError, seq must contain at least 1 element maxval = seq[0] for v in seq: if cmp(maxval, v) = 0: maxval = v return maxval # is it an iterator? if hasattr(seq, __iter__) and hasattr(seq, next): try: maxval = seq.next() except StopIteration: raise ValueError, seq must contain at least 1 element for v in seq: if cmp(maxval, v) = 0: maxval = v return maxval raise TypeError, seq must be a list or an iterator -- Steve R. HastingsVita est [EMAIL PROTECTED]http://www.blarg.net/~steveha -- http://mail.python.org/mailman/listinfo/python-list
maximum() efficency
I was looking at a Python function to find the maximum from a list. The original was more complicated; I simplified it. The built-in max() function can replace the simplified example, but not the original. def maximum(lst): maxval = lst[0] for i in xrange(1, len(lst)): v = lst[i] if maxval v: maxval = v return maxval Immediately I started thinking about ways to make this more efficient. My first thought was to use iterators: def maximum(lst): itr = iter(lst) maxval = itr.next() for v in itr: if maxval v: maxval = v return maxval Then I thought, I wonder if there is a faster solution that *doesn't* use iterators, for use with older versions of Python. I came up with: def maximum(lst): a = [] for v in lst: try: if a[0] v: a[0] = v except IndexError: a.append(v) return a[0] Of course, it's a little ugly to use a[0] instead of maxval. And I'm trying not to index a list, and here I am indexing another list. So: def maximum(lst): for v in lst: try: if maxval v: maxval = v except NameError: maxval = v return maxval And we come at last to my actual question: Is this guaranteed to work on all versions of Python? In my testing on my computer, the above function works perfectly. Even if you do this: maxval = 100 print maximum(cmp, [2, 0, 3, 1]) 3 In other words, you get a NameError in the function even if there is a global variable with that name. That's with Python 2.4, however... is this function guaranteed to work in all Python versions? I am very comfortable with the a[0] version, since I explicitly set a to an empty list, I know a[0] will always raise that IndexError. P.S. I went ahead and benchmarked the above functions, plus one more that is similar to the a[0] solution but used an empty dictionary instead of a zero-length list. I created a list of a million zeroes, and then stored a 1 at the beginning of the list. Thus the maximum functions would spend all their time iterating through values and comparing them, and very little time updating maxval or its equivalent. Times to run each function 100 times on the large list: 36.8 seconds xrange 22.3 seconds iterator 39.2 seconds IndexError on a[0] 31.5 seconds NameError with maxval 43.4 seconds KeyError on empty dictionary The conclusions I draw from these numbers: The sneaky tricks with exceptions don't bring enough speed to be worth using; two of them were actually slower than the xrange solution. Even the NameError one was only a little bit faster, and simple is better than complex, so I don't think I'd ever actually use it. The clear winner was the iterator version. It was much faster than the others, and in my opinion it is simpler and easier to understand than any of the others. -- Steve R. HastingsVita est [EMAIL PROTECTED]http://www.blarg.net/~steveha -- http://mail.python.org/mailman/listinfo/python-list
Re: maximum() efficency
Steve R. Hastings wrote: I was looking at a Python function to find the maximum from a list. The original was more complicated; I simplified it. The built-in max() function can replace the simplified example, but not the original. What's the original? Are you sure max can't solve it with an appropriate decorate-sort-undecorate (DSU) or the key= argument coming in Python 2.5? STeVe -- http://mail.python.org/mailman/listinfo/python-list
Re: maximum() efficency
Steve R. Hastings wrote: I was looking at a Python function to find the maximum from a list. The original was more complicated; I simplified it. The built-in max() function can replace the simplified example, but not the original. But you forgot to shuw us the original... [snip several implementations] Of course, it's a little ugly to use a[0] instead of maxval. And I'm trying not to index a list, and here I am indexing another list. So: def maximum(lst): for v in lst: try: if maxval v: maxval = v except NameError: maxval = v return maxval In my testing on my computer, the above function works perfectly. Even if you do this: maxval = 100 print maximum(cmp, [2, 0, 3, 1]) 3 In other words, you get a NameError in the function even if there is a global variable with that name. That's with Python 2.4, however... is this function guaranteed to work in all Python versions? Not 100% sure, but I think yes. People with longer experience in python can answer definitely. Or you can try it out yourself. The reason it works is that this: def f(): print a a=7; f() works and prints 7 (global a is used), while this def f(): print a a = 12 a=7; f() does NOT work. Python notices that a is going to get assigned to inside f and treats it as a local variable. This is also the case with your code. Try out the above examples without a=7 and notice the different error messages... The clear winner was the iterator version. It was much faster than the others, and in my opinion it is simpler and easier to understand than any of the others. I would have done it in the same way, but probably without the iterators. I.e., like this: def maximum(lst): try: m = lst[0] except (TypeError, IndexError): raise Exception Non-sequence or empty sequence given to maximum) # (you forgot the above sanity checks in your code. # not strictly necessary, but nice to have.) for x in lst: if xm: m=x return m -- http://mail.python.org/mailman/listinfo/python-list
Re: maximum() efficency
Steve R. Hastings [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] I was looking at a Python function to find the maximum from a list. The original was more complicated; I simplified it. The built-in max() function can replace the simplified example, but not the original. If you are interested in both the min and max values, here is an algorithm that performs only 3 comparisons for every 2 list items, instead of the brute force method's 4 comparisons. This walks the input list in pairs, compares the two items to each other, then compares the lesser with the current min and the greater with the current max. def minMax(L): lenL = len(L) if lenL == 0: return None,None if lenL == 1: return L[0],L[0] min_ = max_ = L[0] if lenL % 2: i = 1 else: i = 0 while i lenL: a,b = L[i],L[i+1] if a b: a,b = b,a if a min_: min_ = a if b max_: max_ = b i += 2 return min_,max_ Of course, this much Python bytecode is not near as fast as simply calling the builtins min() and max(). But, if you add psyco to the mix, things aren't so cut-and-dried. I tested this method using a list of randomly-generated strings, and after the list length exceeds 100-500 or so, minMax starts to outperform the compiled min() and max(). The following table shows the timing for the brute force min() and max() calls, followed by minMax(): List length 1: 0.229 0.612 List length 2: 0.056 0.0001081 List length 10: 0.059 0.707 List length100: 0.154 0.087 List length500: 0.589 0.534 List length 1000: 0.0001176 0.670 List length 1: 0.0011485 0.0008954 List length 10: 0.0126720 0.0077379 Using the OP's test of 1 million zeros with the first zero changed to a 1, here is the performance of minMax vs. min() and max(): List length 100: 0.1235953 0.0126896 minMax is 10X faster! Ironically, it's faster calling minMax() than either min() or max(). If your list contains objects that are expensive to compare, the crossover list length may be much shorter. -- Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: maximum() efficency
On Sun, 26 Mar 2006 20:34:28 +0200, Mitja Trampus wrote: I would have done it in the same way, but probably without the iterators. I.e., like this: def maximum(lst): try: m = lst[0] except (TypeError, IndexError): raise Exception Non-sequence or empty sequence given to maximum) # (you forgot the above sanity checks in your code. # not strictly necessary, but nice to have.) for x in lst: if xm: m=x return m I left out the original sanity check, because I just wanted a streamlined simple example. I had a nagging feeling that I was missing something simple, and you have put your finger on it. That's perfect! It's simple, it's clear, and it will work on any version of Python. Thanks! -- Steve R. HastingsVita est [EMAIL PROTECTED]http://www.blarg.net/~steveha -- http://mail.python.org/mailman/listinfo/python-list
Re: maximum() efficency
On Sun, 26 Mar 2006 10:34:16 -0700, Steven Bethard wrote: What's the original? def minimum(cmp, lst): minimum(cmp, lst) Returns the minimal element in non-empty list LST with elements compared via CMP() which should return values with the same semantics as Python's cmp(). If there are several minimal elements, the last one is returned. if not lst: raise ValueError, 'empty list' minval = lst[0] for i in xrange(1, len(lst)): v = lst[i] if cmp(minval, v) 0: minval = v return minval This is from Google's goopy package. http://goog-goopy.sourceforge.net/ -- Steve R. HastingsVita est [EMAIL PROTECTED]http://www.blarg.net/~steveha -- http://mail.python.org/mailman/listinfo/python-list
Re: maximum() efficency
Actually, now that I think about it, the version using iter() has one advantage over your version: it will work correctly if passed either a list or an iterator. So, for versions of Python that have iterators, I'd use the iter() version, but for older versions of Python, I'd use your version. P.S. I benchmarked your version; it ran in 22.0 seconds, just a gnat's whisker faster than the iter() version. -- Steve R. HastingsVita est [EMAIL PROTECTED]http://www.blarg.net/~steveha -- http://mail.python.org/mailman/listinfo/python-list
Re: maximum() efficency
Steve R. Hastings [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] On Sun, 26 Mar 2006 10:34:16 -0700, Steven Bethard wrote: What's the original? def minimum(cmp, lst): minimum(cmp, lst) Returns the minimal element in non-empty list LST with elements compared via CMP() which should return values with the same semantics as Python's cmp(). If there are several minimal elements, the last one is returned. if not lst: raise ValueError, 'empty list' minval = lst[0] for i in xrange(1, len(lst)): v = lst[i] if cmp(minval, v) 0: minval = v return minval This is from Google's goopy package. http://goog-goopy.sourceforge.net/ -- Steve R. HastingsVita est [EMAIL PROTECTED]http://www.blarg.net/~steveha The doc string is not correct. If there are several minimal elements, the *first* one is returned. See this example: import math class Vector(object): def __init__(self,*x): # x is list of n-dimensional coordinates self.x = x def magnitude(self): return math.sqrt(sum(x**2 for x in self.x)) def __cmp__(self,other): return cmp(self.magnitude(), other.magnitude()) def __str__(self): return Vector(%s) % ,.join(str(x) for x in self.x) a = Vector(0,1) b = Vector(0,0) c = Vector(1,0) # these print 1, -1, and 0 to show that cmp() and Vector are doing the right thing print cmp(a,b) print cmp(b,a) print cmp(a,c) print min(a,b) # gives Vector(0,0), or b print max(a,b,c) # gives Vector(0,1) or a print min(a,c)# gives Vector(0,1) or a print minimum(Vector.__cmp__, [a,b]) # gives Vector(0,0), or b print minimum(Vector.__cmp__, [a,c])# gives Vector(0,1) or a -- Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: maximum() efficency
Steve R. Hastings wrote: On Sun, 26 Mar 2006 10:34:16 -0700, Steven Bethard wrote: What's the original? def minimum(cmp, lst): minimum(cmp, lst) Returns the minimal element in non-empty list LST with elements compared via CMP() which should return values with the same semantics as Python's cmp(). If there are several minimal elements, the last one is returned. if not lst: raise ValueError, 'empty list' minval = lst[0] for i in xrange(1, len(lst)): v = lst[i] if cmp(minval, v) 0: minval = v return minval This is from Google's goopy package. http://goog-goopy.sourceforge.net/ Ahh. Yeah, nasty cmp= arguments certainly do screw that up. If you're not just trying to implement a particular API, and you actually have a use-case you think you need this for, you might consider a key= argument instead of a cmp= argument. It'll probably be at least as fast, and Python 2.5 will have one on both min() and max(). It'll work something like: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/389659 STeVe -- http://mail.python.org/mailman/listinfo/python-list
Re: maximum() efficency
On Mon, 27 Mar 2006 00:11:04 +, Paul McGuire wrote: The doc string is not correct. The fault is mine, I'm afraid. I had thought I was copying that from an intact original version, but I must have edited it. I don't remember doing it, but I must have. To check, I went to Sourceforge and downloaded goopy again. Here, for the record, is the original Google maximum(): def maximum(cmp, lst): maximum(cmp, lst) Returns the maximal element in non-empty list LST with elements compared via CMP() which should return values with the same semantics as Python's cmp(). If there are several maximal elements, the last one is returned. if not lst: raise ValueError, 'empty list' maxval = lst[0] for i in xrange(1, len(lst)): v = lst[i] if cmp(maxval, v) = 0: maxval = v return maxval I apologize for the error. P.S. I'm happy to see that Google uses lst for a list. I do, too. I don't much like the convention of using L for a list, but I agree with Guido that using l is bad (looks too much like a one: 1), so lst it is. I would have put the cmp argument second, and made it default to the built-in Python cmp function if not specified. Oh, well. -- Steve R. HastingsVita est [EMAIL PROTECTED]http://www.blarg.net/~steveha -- http://mail.python.org/mailman/listinfo/python-list