Re: flatten a level one list
Michael Spencer wrote: Peter Otten wrote: If you require len(xdata) == len(ydata) there's an easy way to move the loop into C: def flatten7(): n = len(xdata) assert len(ydata) == n result = [None] * (2*n) result[::2] = xdata result[1::2] = ydata return result $ python -m timeit 'from flatten import flatten6 as f' 'f()' 1000 loops, best of 3: 847 usec per loop $ python -m timeit 'from flatten import flatten7 as f' 'f()' 1 loops, best of 3: 43.9 usec per loop Very nice observation, Peter. Thank you :-) Easily the winner. It reminds me of str.translate beating python loops in filtering applications: http://groups.google.com/group/comp.lang.python/msg/e23cdc374144a4bd What's more, you can generalize your approach to any number of sequences and un-equal lengths, with only modest loss of speed: def interleave(*args, **kw): Peter Otten flatten7 (generalized by Michael Spencer) Interleave any number of sequences, padding shorter sequences if kw pad is supplied dopad = pad in kw pad = dopad and kw[pad] count = len(args) lengths = map(len, args) maxlen = max(lengths) result = maxlen*count*[None] for ix, input in enumerate(args): try: result[ix::count] = input except ValueError: if dopad: result[ix::count] = input + [pad]*(maxlen-lengths[ix]) else: raise return result You don't loose speed because the expensive operation is only executed if list lengths differ. Two observations: - list arithmetic like 'input + [pad]*(maxlen-lengths[ix])' should and can be avoided - You are padding twice -- once with None, and then with the real thing. def interleave2(*args, **kw): dopad = pad in kw pad = kw.get(pad) count = len(args) lengths = map(len, args) maxlen = max(lengths) if not dopad and min(lengths) != maxlen: raise ValueError result = maxlen*count*[pad] for ix, input in enumerate(args): result[ix:len(input)*count:count] = input return result $ python -m timeit -s 'from interleave_spencer import xdata, ydata, interleave as i;xdata=xdata[:-1]' 'i(xdata, ydata, pad=None)' 1 loops, best of 3: 69.7 usec per loop $ python -m timeit -s 'from interleave_spencer import xdata, ydata, interleave2 as i;xdata=xdata[:-1]' 'i(xdata, ydata, pad=None)' 1 loops, best of 3: 46.4 usec per loop Not overwhelming, but I expect the difference to grow when the arguments occupy a significant portion of the available memory. Peter -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
[EMAIL PROTECTED] wrote: Creating a list via list/map/filter just for the side effect is not only bad taste, ~ $ python -m timeit -s'a = zip([range(1000)]*2)' 'lst=[];ext=lst.extend' 'for i in a: ext(i)' 100 loops, best of 3: 1.23 usec per loop ~ $ python -m timeit -s'a = zip([range(1000)]*2)' 'lst=[];filter(lst.extend, a)' 100 loops, best of 3: 1.63 usec per loop it is also slower than an explicit loop. Don't do it. Hi, but I found this result instead : [EMAIL PROTECTED]:~$ python ~/lib/python2.4/timeit.py -s a=range(1); b=zip(*[a]*2); l=[None]*len(a)*2; e=l.extend l[::2]=b;l[1::2]=b 100 loops, best of 3: 6.22 msec per loop [EMAIL PROTECTED]:~$ python ~/lib/python2.4/timeit.py -s a=range(1); b=zip(*[a]*2); l=[]; e=l.extend filter(e,b) 100 loops, best of 3: 7.25 msec per loop [EMAIL PROTECTED]:~$ python ~/lib/python2.4/timeit.py -s a=range(1); b=zip(*[a]*2); l=[]; e=l.extend for x in b: e(x) 100 loops, best of 3: 10.7 msec per loop [EMAIL PROTECTED]:~$ So it seems to be faster than explicit loop. By localizing the l.extend name binding, its speed is only 20% slower than the fastest method. May be I have done something wrong in the test ? I hate to admit it but /my/ post had a bug: zip([range(1000)]*2) should have been zip(*[range(1000)]*2). filter() may be ugly, but faster it is. Peter -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
[EMAIL PROTECTED] wrote: David Murmann wrote: # New attempts: from itertools import imap def flatten4(x, y): '''D Murman''' l = [] list(imap(l.extend, izip(x, y))) return l well, i would really like to take credit for these, but they're not mine ;) (credit goes to Michael Spencer). i especially like flatten4, even if its not as fast as the phenomenally faster flatten7. Me too. And expand a bit on flatten4, I got this interesting result. [EMAIL PROTECTED]:~/bonobo/psp$ python ~/lib/python2.4/timeit.py -s import itertools; a=zip(xrange(1000),xrange(1000)) l=len(a); li=[None]*l*2;li[::2]=range(1000); li[1::2]=range(1000) 1000 loops, best of 3: 318 usec per loop [EMAIL PROTECTED]:~/bonobo/psp$ python ~/lib/python2.4/timeit.py -s import itertools,psyco; a=zip(xrange(1000),xrange(1000));li=[] filter(li.extend,a) 1000 loops, best of 3: 474 usec per loop For a fair comparison you'd have to time the zip operation. Still 50% slower but it has the advantage that it works on all kinds of sequence as they can be efficiently izip() together. Creating a list via list/map/filter just for the side effect is not only bad taste, ~ $ python -m timeit -s'a = zip([range(1000)]*2)' 'lst=[];ext=lst.extend' 'for i in a: ext(i)' 100 loops, best of 3: 1.23 usec per loop ~ $ python -m timeit -s'a = zip([range(1000)]*2)' 'lst=[];filter(lst.extend, a)' 100 loops, best of 3: 1.63 usec per loop it is also slower than an explicit loop. Don't do it. Peter -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Peter Otten wrote: [EMAIL PROTECTED] wrote: David Murmann wrote: # New attempts: from itertools import imap def flatten4(x, y): '''D Murman''' l = [] list(imap(l.extend, izip(x, y))) return l well, i would really like to take credit for these, but they're not mine ;) (credit goes to Michael Spencer). i especially like flatten4, even if its not as fast as the phenomenally faster flatten7. Me too. And expand a bit on flatten4, I got this interesting result. [EMAIL PROTECTED]:~/bonobo/psp$ python ~/lib/python2.4/timeit.py -s import itertools; a=zip(xrange(1000),xrange(1000)) l=len(a); li=[None]*l*2;li[::2]=range(1000); li[1::2]=range(1000) 1000 loops, best of 3: 318 usec per loop [EMAIL PROTECTED]:~/bonobo/psp$ python ~/lib/python2.4/timeit.py -s import itertools,psyco; a=zip(xrange(1000),xrange(1000));li=[] filter(li.extend,a) 1000 loops, best of 3: 474 usec per loop For a fair comparison you'd have to time the zip operation. Still 50% slower but it has the advantage that it works on all kinds of sequence as they can be efficiently izip() together. Creating a list via list/map/filter just for the side effect is not only bad taste, ~ $ python -m timeit -s'a = zip([range(1000)]*2)' 'lst=[];ext=lst.extend' 'for i in a: ext(i)' 100 loops, best of 3: 1.23 usec per loop ~ $ python -m timeit -s'a = zip([range(1000)]*2)' 'lst=[];filter(lst.extend, a)' 100 loops, best of 3: 1.63 usec per loop it is also slower than an explicit loop. Don't do it. Hi, but I found this result instead : [EMAIL PROTECTED]:~$ python ~/lib/python2.4/timeit.py -s a=range(1); b=zip(*[a]*2); l=[None]*len(a)*2; e=l.extend l[::2]=b;l[1::2]=b 100 loops, best of 3: 6.22 msec per loop [EMAIL PROTECTED]:~$ python ~/lib/python2.4/timeit.py -s a=range(1); b=zip(*[a]*2); l=[]; e=l.extend filter(e,b) 100 loops, best of 3: 7.25 msec per loop [EMAIL PROTECTED]:~$ python ~/lib/python2.4/timeit.py -s a=range(1); b=zip(*[a]*2); l=[]; e=l.extend for x in b: e(x) 100 loops, best of 3: 10.7 msec per loop [EMAIL PROTECTED]:~$ So it seems to be faster than explicit loop. By localizing the l.extend name binding, its speed is only 20% slower than the fastest method. May be I have done something wrong in the test ? -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Peter Otten wrote: [EMAIL PROTECTED] wrote: David Murmann wrote: # New attempts: from itertools import imap def flatten4(x, y): '''D Murman''' l = [] list(imap(l.extend, izip(x, y))) return l well, i would really like to take credit for these, but they're not mine ;) (credit goes to Michael Spencer). i especially like flatten4, even if its not as fast as the phenomenally faster flatten7. Me too. And expand a bit on flatten4, I got this interesting result. [EMAIL PROTECTED]:~/bonobo/psp$ python ~/lib/python2.4/timeit.py -s import itertools; a=zip(xrange(1000),xrange(1000)) l=len(a); li=[None]*l*2;li[::2]=range(1000); li[1::2]=range(1000) 1000 loops, best of 3: 318 usec per loop [EMAIL PROTECTED]:~/bonobo/psp$ python ~/lib/python2.4/timeit.py -s import itertools,psyco; a=zip(xrange(1000),xrange(1000));li=[] filter(li.extend,a) 1000 loops, best of 3: 474 usec per loop For a fair comparison you'd have to time the zip operation. Still 50% slower but it has the advantage that it works on all kinds of sequence as they can be efficiently izip() together. Creating a list via list/map/filter just for the side effect is not only bad taste, ~ $ python -m timeit -s'a = zip([range(1000)]*2)' 'lst=[];ext=lst.extend' 'for i in a: ext(i)' 100 loops, best of 3: 1.23 usec per loop ~ $ python -m timeit -s'a = zip([range(1000)]*2)' 'lst=[];filter(lst.extend, a)' 100 loops, best of 3: 1.63 usec per loop it is also slower than an explicit loop. Don't do it. ah, stand corrected. -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Peter Otten wrote: .. - You are padding twice -- once with None, and then with the real thing. def interleave2(*args, **kw): dopad = pad in kw pad = kw.get(pad) count = len(args) lengths = map(len, args) maxlen = max(lengths) if not dopad and min(lengths) != maxlen: raise ValueError result = maxlen*count*[pad] for ix, input in enumerate(args): result[ix:len(input)*count:count] = input return result $ python -m timeit -s 'from interleave_spencer import xdata, ydata, interleave as i;xdata=xdata[:-1]' 'i(xdata, ydata, pad=None)' 1 loops, best of 3: 69.7 usec per loop $ python -m timeit -s 'from interleave_spencer import xdata, ydata, interleave2 as i;xdata=xdata[:-1]' 'i(xdata, ydata, pad=None)' 1 loops, best of 3: 46.4 usec per loop Not overwhelming, but I expect the difference to grow when the arguments occupy a significant portion of the available memory. Peter very nice indeed; another generalization is to allow truncation as well def interleave(*args,**kw): Peter Otten flatten7 (generalized by Michael Spencer and Robin Becker) Interleave any number of sequences, padding shorter sequences if kw pad is supplied or truncating if truncate=True is specified interleave([1,3,5], [2,4,6]) == [1,2,3,4,5,6] True interleave([1,2,3]) == [1,2,3] True interleave(*[[1,2,3]]*10) == [1]*10+[2]*10+[3]*10 True interleave(range(0,1000,2),range(1,1000,2)) == range(1000) True interleave([1,2],[3,4,5]) Traceback (most recent call last): ... ValueError: Minimum length=2 != Max length=3 interleave([1,3],[2,4,6], pad = None) == [1,2,3,4,None,6] True interleave([1,3],[2,4,6],truncate=True) == [1,2,3,4] True interleave([1,2],[3,4,5],pad='aaa',truncate=1) Traceback (most recent call last): ... AssertionError: Cannot specify both truncate=True and pad='aaa' dopad = pad in kw pad = kw.get(pad) dotrunc = bool(kw.get('truncate',False)) assert not (dotrunc and pad), \ 'Cannot specify both truncate=True and pad=%r' % pad count = len(args) lengths = map(len,args) maxlen = max(lengths) minlen = min(lengths) if dotrunc: result = minlen*count*[None] for ix, input in enumerate(args): result[ix::count] = input[:minlen] else: if not dopad and minlen!=maxlen: raise ValueError('Minimum length=%d != Max length=%d' % (minlen,maxlen)) result = maxlen*count*[pad] for ix, input in enumerate(args): result[ix:len(input)*count:count] = input return result def _test(): import doctest, interleave return doctest.testmod(interleave) if __name__ == __main__: _test() -- Robin Becker -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Alex Martelli [EMAIL PROTECTED] wrote: Nick Craig-Wood [EMAIL PROTECTED] wrote: Except if you are trying to sum arrays of strings... sum([a,b,c], ) Traceback (most recent call last): File stdin, line 1, in ? TypeError: sum() can't sum strings [use ''.join(seq) instead] I've no idea why this limitation is here... perhaps it is because pre python2.4 calling += on strings was very slow? No: when I implemented sum, I originally specialcased sum on strings to map down to a ''.join -- Guido decided it was confusing and had no advantage wrt calling ''.join directly so he made me put in that check. Hmm, interesting... I agree with Guido about the special case, but I disagree about the error message. Not being able to use sum(L,) reduces the orthogonality of sum for no good reason I could see. However I only noticed that when trying to do the recent python challenge which probaby isn't the best use case ;-) -- Nick Craig-Wood [EMAIL PROTECTED] -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Nick Craig-Wood [EMAIL PROTECTED] wrote: ... I agree with Guido about the special case, but I disagree about the error message. Not being able to use sum(L,) reduces the orthogonality of sum for no good reason I could see. Having sum(L,'') work but be O(N squared) would be an attractive nuisance within the meaning of the law; it's bad enough that sum(L,[]) etc are that way, but at least beginners want to concatenate lists (or tuples, arrays, etc) much less often than they want to concatenate strings. sum is meant to work on _numbers_ -- the reason it takes that second optional parameter is essentially to be able to specify the ``type'' of numbers. In retrospect it might have been better to have a single argument (or interpret multiple ones like min or max do, maybe), forcing the starting value to be integer 0. Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Michael Spencer wrote: result[ix::count] = input + [pad]*(maxlen-lengths[ix]) Peter Otten rewrote: result[ix:len(input)*count:count] = input Quite so. What was I thinking? Michael -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
On Fri, 13 Jan 2006 07:48:39 -0800, [EMAIL PROTECTED] (Alex Martelli) wrote: Nick Craig-Wood [EMAIL PROTECTED] wrote: ... I agree with Guido about the special case, but I disagree about the error message. Not being able to use sum(L,) reduces the orthogonality of sum for no good reason I could see. I agree. Having sum(L,'') work but be O(N squared) would be an attractive Are you saying ''.join is O(N squared)? Or that sum would not be allowed to used the same/equivalent implementation algorithms if it were allowed to work on strings? nuisance within the meaning of the law; it's bad enough that sum(L,[]) etc are that way, but at least beginners want to concatenate lists (or tuples, arrays, etc) much less often than they want to concatenate strings. sum is meant to work on _numbers_ -- the reason it takes that If sum is meant to work on _numbers_, why not call it numsum, to suggest the restriction? ISTM the usual assumption in python is non-restrictive duck typing, and the expectation would be no more restriction than in reduce(operator.add, L, init_val). second optional parameter is essentially to be able to specify the ``type'' of numbers. In retrospect it might have been better to have a single argument (or interpret multiple ones like min or max do, maybe), forcing the starting value to be integer 0. So it was an accident that user types are permitted? class C: ... def __getattr__(self, attr): print attr;raise AttributeError ... sum([C(),C()]) __coerce__ __radd__ Traceback (most recent call last): File stdin, line 1, in ? TypeError: unsupported operand type(s) for +: 'int' and 'instance' sum([C(),C()],C()) __coerce__ __add__ __coerce__ __radd__ Traceback (most recent call last): File stdin, line 1, in ? TypeError: unsupported operand type(s) for +: 'instance' and 'instance' Yet user subclassing of str does not yield an acceptable user type? S = type('S',(str,),{}) sum(S(i) for i in xrange(5)) Traceback (most recent call last): File stdin, line 1, in ? TypeError: unsupported operand type(s) for +: 'int' and 'S' sum(S(i) for i in xrange(5), S('start:')) File stdin, line 1 SyntaxError: invalid syntax sum((S(i) for i in xrange(5)), S('start:')) Traceback (most recent call last): File stdin, line 1, in ? TypeError: sum() can't sum strings [use ''.join(seq) instead] Although, notice class SI(str): ... def __add__(self, other): return SI(str.__add__(self, str(other))) ... __radd__ = __add__ ... sum((SI(i) for i in xrange(5))) '001234' But: sum((SI(i) for i in xrange(5)), SI('start:')) Traceback (most recent call last): File stdin, line 1, in ? TypeError: sum() can't sum strings [use ''.join(seq) instead] vs reduce(operator.__add__, (SI(i) for i in xrange(5)), SI('start:')) 'start:01234' Which seems to boil down to incomplete restrictions on duck typing. Maybe Guido will want to complete it, but ISTM your original implementation delegating string addition implementation to ''.join was reasonable. Regards, Bengt Richter -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
I added my own function to the benchmark of Robin Becker:from itertools import chaindef flatten9(x, y): return list(chain(*izip(x, y)))Results: no psyco Name 10 20100200500 1000 flatten1104.499199.699854.301 1673.102 4084.301 8078.504 flatten2111.103204.706944.901 1778.793 4554.701 8773.494 flatten3174.594310.302 1526.308 2880.001 7332.49212373.209 flatten4115.204156.093467.2058 53.705 1920.795 2755.713 flatten5 79.894117.803406.504762.892 1764.297 2663.898 flatten6136.399246.596 1142.406 2243.400 5494.809 8625.221 flatten6a163.889279.689 1320.195 2691.817 6481.910 9879.899 flatten6b175.881275.111 1220.393 2440.596 5955.291 8979.106 flatten6c160.813272.989 1138.186 2472.591 5726.314 8415.699 flatten6d126.004215.292988.603 1932.383 4734.492 7447.696 flatten7 37.217 43.297 89.407134.897233.006343.013 flatten8 93.198190.306 1739.597 4987.90727208.01878883.505flat ten8a112.915220.299 1875.997 5491.59028395.31981628.394 flatten9 98.896159.812 651.288 1153.994 2980.685 3927.398Unfortunatly I can't test with psyco for the moment...Cyril -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Another try: def flatten6(x, y): return list(chain(*izip(x, y))) (any case, this is shorter ;-) Cyril On 1/12/06, Michael Spencer [EMAIL PROTECTED] wrote: Tim Hochberg wrote: Michael Spencer wrote: Robin Becker schrieb: Is there some smart/fast way to flatten a level one list using the latest iterator/generator idioms. ... David Murmann wrote: Some functions and timings ... Here's one more that's quite fast using Psyco, but only average without it. def flatten6(): n = min(len(xdata), len(ydata)) result = [None] * (2*n) for i in xrange(n): result[2*i] = xdata[i] result[2*i+1] = ydata[i] -tim Indeed: I added yours to the list (after adding the appropriate return) testthem() timethem() flatten1(...) 702 iterations, 0.71msec per call flatten2(...) 641 iterations, 0.78msec per call flatten3(...) 346 iterations, 1.45msec per call flatten4(...) 1447 iterations, 345.66usec per call flatten5(...) 1218 iterations, 410.55usec per call flatten6(...) 531 iterations, 0.94msec per call (See earlier post for flatten1-5) Michael -- http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Paul Rubin wrote: Paul Rubin http://[EMAIL PROTECTED] writes: import operator a=[(1,2),(3,4),(5,6)] reduce(operator.add,a) (1, 2, 3, 4, 5, 6) (Note that the above is probably terrible if the lists are large and you're after speed.) yes, and it is all in C and so could be a contender for the speed champ. I guess what you're saying is that it's doing (1,2) (1,2)+(3,4) (1,2,3,4)+(5,6) ie we do n or n-1 tuple additions each of which requires tuple allocation etc etc A fast implementation would probably allocate the output list just once and then stream the values into place with a simple index. -- Robin Becker -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Tim Hochberg wrote: Here's one more that's quite fast using Psyco, but only average without it. def flatten6(): n = min(len(xdata), len(ydata)) result = [None] * (2*n) for i in xrange(n): result[2*i] = xdata[i] result[2*i+1] = ydata[i] I you require len(xdata) == len(ydata) there's an easy way to move the loop into C: def flatten7(): n = len(xdata) assert len(ydata) == n result = [None] * (2*n) result[::2] = xdata result[1::2] = ydata return result $ python -m timeit 'from flatten import flatten6 as f' 'f()' 1000 loops, best of 3: 847 usec per loop $ python -m timeit 'from flatten import flatten7 as f' 'f()' 1 loops, best of 3: 43.9 usec per loop Peter -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Robin Becker [EMAIL PROTECTED] writes: reduce(operator.add,a) ... A fast implementation would probably allocate the output list just once and then stream the values into place with a simple index. That's what I hoped sum would do, but instead it barfs with a type error. So much for duck typing. -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Robin Becker wrote: Paul Rubin wrote: Paul Rubin http://[EMAIL PROTECTED] writes: import operator a=[(1,2),(3,4),(5,6)] reduce(operator.add,a) (1, 2, 3, 4, 5, 6) (Note that the above is probably terrible if the lists are large and you're after speed.) yes, and it is all in C and so could be a contender for the speed champ. I guess what you're saying is that it's doing That is what I thought too but seems that [x for pair in li for x in pair] is the fastest on my machine and what is even stranger is that if I use psyco.full(), I got a 10x speed up for this solution(list comprehension) which is head and shoulder above all the other suggested so far. -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Well, maybe it's time to add a n-levels flatten() function to the language (or to add it to itertools). Python is open source, but I am not able to modify its C sources yet... Maybe Raymond Hettinger can find some time to do it for Py 2.5. Bye, bearophile -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
[Robin Becker] Is there some smart/fast way to flatten a level one list using the latest iterator/generator idioms. The problem arises in coneverting lists of (x,y) coordinates into a single list of coordinates eg f([(x0,y0),(x1,y1),]) -- [x0,y0,x1,y1,] Here's one way: d = [('x0','y0'), ('x1','y1'), ('x2','y2'), ('x3', 'y3')] list(chain(*d)) ['x0', 'y0', 'x1', 'y1', 'x2', 'y2', 'x3', 'y3'] FWIW, if you're into working out puzzles, there's no end of interesting iterator algebra tricks. Here are a few identities for your entertainment: # Given s (any sequence) and n (a non-negative integer): assert zip(*izip(*tee(s,n))) == [tuple(s)]*n assert list(chain(*tee(s,n))) == list(s)*n assert map(itemgetter(0),groupby(sorted(s))) == sorted(set(s)) Raymond -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
In article [EMAIL PROTECTED], Paul Rubin http://[EMAIL PROTECTED] wrote: Robin Becker [EMAIL PROTECTED] writes: reduce(operator.add,a) ... That's what I hoped sum would do, but instead it barfs with a type error. So much for duck typing. sum(...) sum(sequence, start=0) - value If you're using sum() as a 1-level flatten you need to give it start=[]. -- \S -- [EMAIL PROTECTED] -- http://www.chaos.org.uk/~sion/ ___ | Frankly I have no feelings towards penguins one way or the other \X/ |-- Arthur C. Clarke her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Sion Arrowsmith [EMAIL PROTECTED] writes: sum(sequence, start=0) - value If you're using sum() as a 1-level flatten you need to give it start=[]. Oh, right, I should have remembered that. Thanks. Figuring out whether it's quadratic or linear would still take an experiment or code inspection which I'm not up for at the moment. -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Peter Otten wrote: Tim Hochberg wrote: Here's one more that's quite fast using Psyco, but only average without it. def flatten6(): n = min(len(xdata), len(ydata)) result = [None] * (2*n) for i in xrange(n): result[2*i] = xdata[i] result[2*i+1] = ydata[i] I you require len(xdata) == len(ydata) there's an easy way to move the loop into C: def flatten7(): n = len(xdata) assert len(ydata) == n result = [None] * (2*n) result[::2] = xdata result[1::2] = ydata return result $ python -m timeit 'from flatten import flatten6 as f' 'f()' 1000 loops, best of 3: 847 usec per loop $ python -m timeit 'from flatten import flatten7 as f' 'f()' 1 loops, best of 3: 43.9 usec per loop Peter That's the winner for my machine and it works in the case I need :) The numbers are microseconds/call I used 20 reps and n from 10 up to 1000. no psyco Name1020 100 200 500 1000 flatten1 111.383 189.298 745.709 1397.300 3499.579 6628.775 flatten2 142.923 209.496 907.182 1521.618 3565.397 7197.228 flatten3 176.224 314.342 1385.958 2733.560 6726.693 12879.067 flatten4 112.696 163.010 518.250 901.288 1979.749 3657.364 flatten578.334 110.768 386.949 711.794 1617.664 3255.749 flatten6 142.867 230.420 894.639 1767.012 4499.734 9017.906 flatten6a 163.093 263.330 1071.337 2084.287 5209.433 10383.610 flatten6b 180.582 275.761 1063.794 2074.705 5057.123 10043.567 flatten6c 167.898 253.664 974.202 1948.181 4821.339 9562.780 flatten6d 132.475 201.702 738.194 1406.659 3612.107 7242.038 flatten759.03062.35490.347 130.771 254.613 438.994 flatten888.978 173.737 1667.111 5674.297 28907.501 106330.749 flatten8a 107.388 225.951 2323.563 7088.136 34254.381 114538.384 psyco Name1020 100 200 500 1000 flatten184.424 114.596 393.374 714.728 1809.448 3197.837 flatten2 102.387 136.302 507.243 942.494 2276.770 4451.990 flatten385.206 111.020 379.713 715.957 1607.104 3191.188 flatten4 102.667 144.599 509.255 856.450 1839.591 3425.128 flatten579.898 115.490 383.904 730.484 1739.411 3515.978 flatten654.56061.293 183.012 332.109 837.146 1604.366 flatten6a79.647 108.114 405.107 752.917 1873.674 3824.620 flatten6b 111.746 132.978 473.189 907.378 2217.600 4257.357 flatten6c98.756 110.629 376.724 730.037 1772.963 3524.247 flatten6d59.25369.199 172.731 295.820 717.577 1402.720 flatten751.29139.75465.707 104.902 233.214 405.694 flatten887.050 166.837 1665.407 5410.576 28459.567 107847.422 flatten8a 122.753 251.457 2766.944 7931.204 36353.503 120773.674 ### from itertools import izip import timeit _R=100 def flatten1(x, y): '''D Murman''' return [i for pair in izip(x, y) for i in pair] def flatten2(x, y): '''D Murman''' return [i for pair in zip(x, y) for i in pair] def flatten3(x, y): '''D Murman''' res = [] for pair in izip(x, y): for i in pair: res.append(i) return res # New attempts: from itertools import imap def flatten4(x, y): '''D Murman''' l = [] list(imap(l.extend, izip(x, y))) return l from Tkinter import _flatten def flatten5(x, y): '''D Murman''' return list(_flatten(zip(x, y))) def flatten6(x,y): '''Tim Hochberg''' n = min(len(x), len(y)) result = [None] * (2*n) for i in xrange(n): result[2*i] = xdata[i] result[2*i+1] = ydata[i] return result def flatten6a(x,y): '''Robin Becker variant of 6''' n = min(len(x), len(y)) result = [None] * (2*n) for i in xrange(n): result[2*i:2*i+2] = xdata[i],ydata[i] return result def flatten6b(x,y): '''Robin Becker variant of 6''' n = min(len(x), len(y)) result = [None] * (2*n) for i,pair in enumerate(zip(xdata,ydata)): result[2*i:2*i+2] = pair return result def flatten6c(x,y): '''Robin Becker variant of 6''' n = min(len(x), len(y)) result = [None] * (2*n) for i,pair in enumerate(izip(xdata,ydata)): result[2*i:2*i+2] = pair return result def flatten6d(x,y): '''Robin Becker variant of 6''' n = min(len(x), len(y)) result = [None] * (2*n) j = 0 for i in xrange(n): result[j] = xdata[i] result[j+1] = ydata[i] j+=2 return result from operator import add as operator_add def flatten8(x,y): '''Paul Rubin''' return reduce(operator_add,zip(x,y),()) def flatten8a(x,y): '''Robin Becker variant of 8''' return reduce(operator_add,(xy for xy in izip(x,y)),()) def flatten7(x,y): '''Peter Otten
Re: flatten a level one list
Robin Becker schrieb: # New attempts: from itertools import imap def flatten4(x, y): '''D Murman''' l = [] list(imap(l.extend, izip(x, y))) return l from Tkinter import _flatten def flatten5(x, y): '''D Murman''' return list(_flatten(zip(x, y))) well, i would really like to take credit for these, but they're not mine ;) (credit goes to Michael Spencer). i especially like flatten4, even if its not as fast as the phenomenally faster flatten7. -- David Murmann (NN!) ;) -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Sion Arrowsmith [EMAIL PROTECTED] wrote: sum(...) sum(sequence, start=0) - value If you're using sum() as a 1-level flatten you need to give it start=[]. Except if you are trying to sum arrays of strings... sum([a,b,c], ) Traceback (most recent call last): File stdin, line 1, in ? TypeError: sum() can't sum strings [use ''.join(seq) instead] I've no idea why this limitation is here... perhaps it is because pre python2.4 calling += on strings was very slow? -- Nick Craig-Wood [EMAIL PROTECTED] -- http://www.craig-wood.com/nick -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Peter Otten wrote: I you require len(xdata) == len(ydata) there's an easy way to move the loop into C: def flatten7(): n = len(xdata) assert len(ydata) == n result = [None] * (2*n) result[::2] = xdata result[1::2] = ydata return result $ python -m timeit 'from flatten import flatten6 as f' 'f()' 1000 loops, best of 3: 847 usec per loop $ python -m timeit 'from flatten import flatten7 as f' 'f()' 1 loops, best of 3: 43.9 usec per loop Peter Very nice observation, Peter. Easily the winner. It reminds me of str.translate beating python loops in filtering applications: http://groups.google.com/group/comp.lang.python/msg/e23cdc374144a4bd What's more, you can generalize your approach to any number of sequences and un-equal lengths, with only modest loss of speed: def interleave(*args, **kw): Peter Otten flatten7 (generalized by Michael Spencer) Interleave any number of sequences, padding shorter sequences if kw pad is supplied dopad = pad in kw pad = dopad and kw[pad] count = len(args) lengths = map(len, args) maxlen = max(lengths) result = maxlen*count*[None] for ix, input in enumerate(args): try: result[ix::count] = input except ValueError: if dopad: result[ix::count] = input + [pad]*(maxlen-lengths[ix]) else: raise return result def test_interleave(): assert interleave([1,3,5], [2,4,6]) == [1,2,3,4,5,6] assert interleave([1,2,3]) == [1,2,3] assert interleave(*[[1,2,3]]*10) == [1]*10+[2]*10+[3]*10 assert interleave(range(0,1000,2),range(1,1000,2)) == range(1000) def raises(func,*args, **kw): exc = kw.get(exc, Exception) try: func(*args) except exc: return True assert raises(interleave,[1,2],[3,4,5]) assert interleave([1,3],[2,4,6], pad = None) == [1,2,3,4,None,6] def timethem(): for n in [1000, 1, 10, 100]: x = range(n); y = range(n) print List length: ,n for func in [flatten7, interleave]: print shell.timefunc(func, x, y) timethem() List length: 1000 flatten7(...) 6442 iterations, 77.62usec per call interleave(...) 5475 iterations, 91.33usec per call List length: 1 flatten7(...) 318 iterations, 1.57msec per call interleave(...) 315 iterations, 1.59msec per call List length: 10 flatten7(...) 25 iterations, 20.61msec per call interleave(...) 24 iterations, 21.06msec per call List length: 100 flatten7(...) 3 iterations, 198.51msec per call interleave(...) 3 iterations, 198.91msec per call Michael -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Nick Craig-Wood [EMAIL PROTECTED] wrote: Sion Arrowsmith [EMAIL PROTECTED] wrote: sum(...) sum(sequence, start=0) - value If you're using sum() as a 1-level flatten you need to give it start=[]. Except if you are trying to sum arrays of strings... sum([a,b,c], ) Traceback (most recent call last): File stdin, line 1, in ? TypeError: sum() can't sum strings [use ''.join(seq) instead] I've no idea why this limitation is here... perhaps it is because pre python2.4 calling += on strings was very slow? No: when I implemented sum, I originally specialcased sum on strings to map down to a ''.join -- Guido decided it was confusing and had no advantage wrt calling ''.join directly so he made me put in that check. Alex -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
David Murmann wrote: Robin Becker schrieb: # New attempts: from itertools import imap def flatten4(x, y): '''D Murman''' l = [] list(imap(l.extend, izip(x, y))) return l from Tkinter import _flatten def flatten5(x, y): '''D Murman''' return list(_flatten(zip(x, y))) well, i would really like to take credit for these, but they're not mine ;) (credit goes to Michael Spencer). i especially like flatten4, even if its not as fast as the phenomenally faster flatten7. Me too. And expand a bit on flatten4, I got this interesting result. [EMAIL PROTECTED]:~/bonobo/psp$ python ~/lib/python2.4/timeit.py -s import itertools; a=zip(xrange(1000),xrange(1000)) l=len(a); li=[None]*l*2;li[::2]=range(1000); li[1::2]=range(1000) 1000 loops, best of 3: 318 usec per loop [EMAIL PROTECTED]:~/bonobo/psp$ python ~/lib/python2.4/timeit.py -s import itertools,psyco; a=zip(xrange(1000),xrange(1000));li=[] filter(li.extend,a) 1000 loops, best of 3: 474 usec per loop Still 50% slower but it has the advantage that it works on all kinds of sequence as they can be efficiently izip() together. -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Robin Becker wrote: Is there some smart/fast way to flatten a level one list using the latest iterator/generator idioms. The problem arises in coneverting lists of (x,y) coordinates into a single list of coordinates eg f([(x0,y0),(x1,y1),]) -- [x0,y0,x1,y1,] or g([x0,x1,x2,..],[y0,y1,y2,]) -- [x0,y0,x1,y1,] clearly if f is doable then g can be done using zip. I suppose this is a special case flatten, but can flatten be done fast? The python recipes seem rather slow compared to the builtin functions. how fast is fast ? for this case, is the following good enough ? def flat(li): for x,y in li: yield x yield y -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Robin Becker schrieb: Is there some smart/fast way to flatten a level one list using the latest iterator/generator idioms. The problem arises in coneverting lists of (x,y) coordinates into a single list of coordinates eg f([(x0,y0),(x1,y1),]) -- [x0,y0,x1,y1,] or g([x0,x1,x2,..],[y0,y1,y2,]) -- [x0,y0,x1,y1,] clearly if f is doable then g can be done using zip. I suppose this is a special case flatten, but can flatten be done fast? The python recipes seem rather slow compared to the builtin functions. well then: first of all, i need to say, if speed really matters, do it in C. that being said, python can be fast, too. for this task psyco is your friend. i got this output from the script given below: without psyco: flatten1: 2.78046748059 flatten2: 2.90226239686 flatten3: 4.91070862996 goopy_flatten1: 8.22951110963 goopy_flatten2: 8.56373180172 with psyco: flatten1: 1.17390339924 flatten2: 1.7209583052 flatten3: 1.18490295558 goopy_flatten1: 1.34892236194 goopy_flatten2: 1.68568386584 the goopy function is taken from the google-functional package (but is treated a bit unfair, i must admit, being wrapped in a lambda) so, what does that show us? izip seems a bit faster than zip with these input data. you want to do your own timings with more realistic data. and all these functions are what just came to my mind, i'm sure they can be improved. hope this helps, -- David. used script: from itertools import izip xdata = range(1000) ydata = range(1000)[::-1] def flatten1(): return [x for pair in izip(xdata, ydata) for x in pair] def flatten2(): return [x for pair in zip(xdata, ydata) for x in pair] def flatten3(): res = [] for pair in izip(xdata, ydata): for x in pair: res.append(x) return res def goopy_flatten(seq): lst = [] for x in seq: if type(x) is list or type(x) is tuple: for val in x: lst.append(val) else: lst.append(x) return lst goopy_flatten1 = lambda: goopy_flatten(izip(xdata, ydata)) goopy_flatten2 = lambda: goopy_flatten(zip(xdata, ydata)) if __name__=='__main__': from timeit import Timer functions = ['flatten1', 'flatten2', 'flatten3', 'goopy_flatten1', 'goopy_flatten2'] print 'without psyco:' print for fn in functions: t = Timer(fn+'()', 'from __main__ import '+fn) print fn+':', t.timeit(5000) try: import psyco; psyco.full() except ImportError: pass print print 'with psyco:' print for fn in functions: t = Timer(fn+'()', 'from __main__ import '+fn) print fn+':', t.timeit(5000) -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Robin Becker [EMAIL PROTECTED] writes: f([(x0,y0),(x1,y1),]) -- [x0,y0,x1,y1,] import operator a=[(1,2),(3,4),(5,6)] reduce(operator.add,a) (1, 2, 3, 4, 5, 6) -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Paul Rubin http://[EMAIL PROTECTED] writes: import operator a=[(1,2),(3,4),(5,6)] reduce(operator.add,a) (1, 2, 3, 4, 5, 6) (Note that the above is probably terrible if the lists are large and you're after speed.) -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Robin Becker schrieb: Is there some smart/fast way to flatten a level one list using the latest iterator/generator idioms. ... David Murmann wrote: Some functions and timings ... Here are some more timings of David's functions, and a couple of additional contenders that time faster on my box (I don't have psyco): # From David Murman from itertools import izip xdata = range(1000) ydata = range(1000)[::-1] def flatten1(x, y): return [i for pair in izip(x, y) for i in pair] def flatten2(x, y): return [i for pair in zip(x, y) for i in pair] def flatten3(x, y): res = [] for pair in izip(x, y): for i in pair: res.append(i) return res # New attempts: from itertools import imap def flatten4(x, y): l = [] list(imap(l.extend, izip(x, y))) return l from Tkinter import _flatten def flatten5(x, y): return list(_flatten(zip(x, y))) flatten_funcs = [flatten1, flatten2, flatten3, flatten4, flatten5] def testthem(): flatten1res = flatten_funcs[0](xdata, ydata) for func in flatten_funcs: assert func(xdata, ydata) == flatten1res def timethem(): for func in flatten_funcs: print shell.timefunc(func, xdata, ydata) testthem() timethem() flatten1(...) 704 iterations, 0.71msec per call flatten2(...) 611 iterations, 0.82msec per call flatten3(...) 344 iterations, 1.46msec per call flatten4(...) 1286 iterations, 389.08usec per call flatten5(...) 1219 iterations, 410.24usec per call Michael -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Michael Spencer wrote: Robin Becker schrieb: Is there some smart/fast way to flatten a level one list using the latest iterator/generator idioms. ... David Murmann wrote: Some functions and timings ... Here's one more that's quite fast using Psyco, but only average without it. def flatten6(): n = min(len(xdata), len(ydata)) result = [None] * (2*n) for i in xrange(n): result[2*i] = xdata[i] result[2*i+1] = ydata[i] -tim Here are some more timings of David's functions, and a couple of additional contenders that time faster on my box (I don't have psyco): # From David Murman from itertools import izip xdata = range(1000) ydata = range(1000)[::-1] def flatten1(x, y): return [i for pair in izip(x, y) for i in pair] def flatten2(x, y): return [i for pair in zip(x, y) for i in pair] def flatten3(x, y): res = [] for pair in izip(x, y): for i in pair: res.append(i) return res # New attempts: from itertools import imap def flatten4(x, y): l = [] list(imap(l.extend, izip(x, y))) return l from Tkinter import _flatten def flatten5(x, y): return list(_flatten(zip(x, y))) flatten_funcs = [flatten1, flatten2, flatten3, flatten4, flatten5] def testthem(): flatten1res = flatten_funcs[0](xdata, ydata) for func in flatten_funcs: assert func(xdata, ydata) == flatten1res def timethem(): for func in flatten_funcs: print shell.timefunc(func, xdata, ydata) testthem() timethem() flatten1(...) 704 iterations, 0.71msec per call flatten2(...) 611 iterations, 0.82msec per call flatten3(...) 344 iterations, 1.46msec per call flatten4(...) 1286 iterations, 389.08usec per call flatten5(...) 1219 iterations, 410.24usec per call Michael -- http://mail.python.org/mailman/listinfo/python-list
Re: flatten a level one list
Tim Hochberg wrote: Michael Spencer wrote: Robin Becker schrieb: Is there some smart/fast way to flatten a level one list using the latest iterator/generator idioms. ... David Murmann wrote: Some functions and timings ... Here's one more that's quite fast using Psyco, but only average without it. def flatten6(): n = min(len(xdata), len(ydata)) result = [None] * (2*n) for i in xrange(n): result[2*i] = xdata[i] result[2*i+1] = ydata[i] -tim Indeed: I added yours to the list (after adding the appropriate return) testthem() timethem() flatten1(...) 702 iterations, 0.71msec per call flatten2(...) 641 iterations, 0.78msec per call flatten3(...) 346 iterations, 1.45msec per call flatten4(...) 1447 iterations, 345.66usec per call flatten5(...) 1218 iterations, 410.55usec per call flatten6(...) 531 iterations, 0.94msec per call (See earlier post for flatten1-5) Michael -- http://mail.python.org/mailman/listinfo/python-list